Description

FIELD OF THE INVENTION 

[0001] This invention is directed to the production of plants with a reduced susceptibility to virus infection.

BACKGROUND OF THE INVENTION

[0002] Plant viruses are responsible for major losses in worldwide crop production. Much effort is directed towards
the development of new plant varieties which exhibit increased resistance to viral infection. Until recently such efforts
were primarily based on the traditional plant breeding approach; however, this approach is often limited by a lack of
sources of resistance within the crop species. The advent of modern molecular biology techniques has facilitated the
development of new methods of rendering plant varieties resistant to virus attack that are not limited by a requirement
for preexisting resistance genes within a species.

Molecular Approaches 

[0003] Many of these molecular approaches are based on the theory of pathogen derived resistance (Sanford and
Johnston, 1985). This theory predicts that a "normal" host (plant) - pathogen (virus) relationship can be disrupted if the
host organism expresses essential pathogen derived genes. It has been proposed that host organisms expressing
pathogen gene products in excess amounts, at an inappropriate developmental stage, or in a dysfunctional form may
disrupt the normal replicative cycle of the pathogen and result in an attenuated or aborted infection of the host.

[0004] Two approaches typify this pathogen derived resistance: coat protein mediated resistance and antisense RNA
expression. It has been demonstrated that transgenic plants expressing a plant virus coat protein can be resistant to
infection by the homologous virus. This coat protein mediated resistance has been demonstrated for several virus
groups. While the mechanism of this resistance is not yet fully understood, it has been suggested that the presence
of the plant synthesized coat protein prevents the removal of the protein coat (uncoating) of an invading virus and/or
virus movement within the infected plant, leading to resistance.

[0005] Plants which express an RNA molecule which is complementary to plus sense RNA species encoded by the
virus may show a decreased susceptibility to infection by that virus. Such a complementary RNA molecule is termed
antisense RNA. It is thought that the plant encoded antisense RNA binds to the viral RNA and thus inhibits its function.

Potyviruses 

[0006] The Potato Virus Y, or potyvirus, family represents a large number of plant viral pathogens which collectively
can infect most crop species including both monocotyledonous and dicotyledonous plants. Potyvirus infection can
induce a variety of symptoms including leaf mottling, seed and fruit distortion and can severely compromise crop yield
and/or quality (Hollings and Brunt, 1981).

[0007] Potyviruses have a single-strand plus sense RNA of circa 10,000 nucleotides which has a viral encoded
protein linked to the 5' end and a 3' polyadenylate region. A single open reading frame codes for a 351 kDa polyprotein
which is proteolytically processed into mature viral gene products. The RNA is encapsidated by approximately 2,000
copies of a coat protein monomer to form a virion. This capsid protein is encoded by the sequence present at the 3'
end of the large open reading frame.

[0008] Potyviruses can be transmitted by aphids and other sap feeding insects and in some instances can also be
transmitted in the seeds of infected plants. Replication of the viral RNA is thought to occur in the cytoplasm of infected
plant cells after uncoating. The replication mechanism involves both translation of the plus sense RNA to yield viral
gene products (which include a replicase and a proteinase) and also the synthesis of a minus sense RNA strand. This
minus sense strand then acts as a template for the synthesis of many plus sense genomes which are subsequently
encapsidated in coat protein to yield infectious mature "virions", thus completing the replicative cycle of the virus.

[0009] Experiments have been reported in which transgenic plants expressing the coat protein gene of a potyvirus
show a reduced susceptibility to virus infection (Lawson et al. 1990; Ling et al. 1991; Stark and Beachy 1989).

[0010] EP-A-0242016 discloses the incorporation of genetic material, in particular cDNA corresponding to plant viral
satellite RNA, into a plant such that, when the plant is infected by a plant virus, the expression of the incorporated
material modifies the plant virus or its effects.

[0011] WO-A-9213090 discloses a method for producing transgenic plants with reduced virus susceptibility.
SUMMARY OF THE INVENTION

[0012] The disclosed invention concerns a method of producing plants with a decreased susceptibility to virus infection.
This is achieved by transforming plants with a DNA molecule which includes a gene derived in part from the
genome of a plant virus [...]

[0013] In particular, this invention provides an alternative and novel approach to rendering plants resistant to potyvirus
infection.

[0014] Plants are transformed with a gene construct engineered to express an untranslatable form of the plus sense
RNA which encodes the coat protein of a potyvirus.

[0015] In the case of Tobacco Etch Virus (TEV), it is demonstrated that tobacco plants transformed with such a gene
construct accumulate the untranslatable plus sense RNA but do not produce detectable levels of the coat protein. It is
further shown that these plants are resistant to TEV infection. It is also shown that tobacco cells expressing this
untranslatable plus sense RNA do not support TEV replication, unlike control tobacco cells and also unlike tobacco cells
which are engineered to express the plus sense translatable RNA and which, as a result, accumulate TEV coat protein.
Although the exact mechanism is unknown, it is proposed that the untranslatable plus sense RNA inhibits viral
replication by binding to the minus sense RNA and preventing the minus sense RNA from functioning in the replication cycle.

[0016] It is believed that this approach will be applicable to other potyviruses, to genes other than the coat protein
gene and to other plus sense RNA virus families. It is also believed that this means of inhibiting gene function is
applicable to other biological systems, including mammalian viruses.

DESCRIPTION OF DRAWINGS 

[0017] 
Fig. 1 represents the nucleotide sequence of the Tobacco Etch Virus genome and its deduced amino acid
sequence, according to Allison et al. (1986). The nucleotide sequence of the plus sense strand of the DNA inserts
is given. The first nucleotide (N) could not be determined unequivocally. The predicted amino acid sequence of
the large ORF of reading frame three of the virion sense RNA is presented in the nucleotide sequence. This
sequence is also set forth in SEQ ID No. 1 of the enclosed sequence listing. The termination codon at the end of the
large ORF is marked with a [...]. The putative cleavage site between the large (54,000 Mw) nuclear inclusion protein
and the capsid protein is indicated by the arrow. Oligonucleotide primer binding sites are underlined and labeled.
Fig. 2 is a schematic representation of the construction of pTC:FL, utilized in construction of transformation vectors
for the invention. Restriction endonuclease sites were introduced into pTL 37/8595 at positions A, B and C in the
diagram. Following these nucleotide changes the mutated pTL 37/8595 was digested with the restriction enzyme
NcoI, the DNA fragment delineated by the restriction enzyme sites at B and C was removed, and the plasmid
religated to generate pTC:FL. pTC:FL contains the Tobacco Etch Virus (TEV) coat protein nucleotide sequence
flanked by BamHI restriction sites and the TEV 5' and 3' untranslated sequences (UTS). T7 and SP6 promoters
are also shown. Abbreviations used in this diagram are as follows: T7, T7 RNA polymerase promoter sequence;
SP6, SP6 RNA polymerase promoter sequence; ori, origin of replication; M13 ori, bacteriophage M13 single-stranded
origin of replication; amp^r, β-lactamase gene. Lightly stippled areas are TEV 5' and 3' untranslated sequences;
solid black area, TEV genome cDNA nucleotides 144 to 200; striped area, a portion of the TEV NIb gene (TEV nt
8462-8517); heavily stippled areas, cDNA of TEV CP nucleotide sequence (TEV nt 8518-9309).
Fig. 3 is a schematic representation of the forms of the Tobacco Etch Virus coat protein gene inserted into tobacco
in the invention. All constructs contained the enhanced CaMV 35S (Enh 35S) promoter, CaMV 35S 5' untranslated
sequence (UTS) of 50 bp and the CaMV 35S 3' UTS/polyadenylation site of 110 bp. The nomenclature used to
describe the transgenic plant lines is presented along with the gene products produced in those plant lines (far
right column). Abbreviations are as follows: 35S, transgenic plants containing the CaMV 35S promoter and 5' and
3' UTS only; FL, transgenic plants containing the transgene coding for full-length TEV CP; AS and RC transgenic plants
contain the transgene expressed as an antisense form of the TEV CP gene, or an untranslated sense form of the
TEV CP gene, respectively. Stippled areas represent various forms of the TEV CP nucleotide sequence.
Fig. 4 is a graphic representation of the appearance of systemic symptoms in plants infected with Tobacco Etch
Virus showing responses of control plants and transformed plants generated as described in the invention. Ten
B49 (wild type) plants and ten R2 plants of transgenic plant lines 35S #4, FL #3, FL #24, homozygous for the
inserted TEV gene, were mechanically inoculated with 50 µl of 1:10 dilution of infected plant sap (A). Twenty B49
plants and 20 R1 plants of lines AS #3 and RC #5 were mechanically inoculated with 50 µl of 5 µg/ml TEV (B).
Plants were examined daily for the appearance of systemic symptoms, and any plant displaying systemic symptoms
(attenuated or wild-type) was recorded as symptomatic.

SEQUENCE LISTING 

[0018] The attached sequence listing sets forth nucleotide sequences relevant to the present invention.

[0019] SEQ ID No. 1 is the complementary DNA sequence corresponding to the Tobacco Etch Virus genome.

[0020] SEQ ID No. 2 is the nucleotide sequence of the modified Tobacco Etch Virus coat protein gene present in
pTC:FL.

[0021] SEQ ID No. 3 is the nucleotide sequence of the modified Tobacco Etch Virus coat protein gene present in
pTC:RC.

[0022] SEQ ID No. 4 is the nucleotide sequence of the modified Tobacco Etch Virus coat protein gene present in
pTC:AS. It is the inverse complement of SEQ ID No. 2.
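
The relationship between SEQ ID No. 2 and SEQ ID No. 4 (inverse complement) can be illustrated with a minimal sketch. The function name and the 12-nucleotide test fragment are hypothetical and are not taken from the sequence listing; only the base-pairing rule itself is standard.

```python
# Watson-Crick pairing for DNA bases; N (an undetermined base) pairs with N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def inverse_complement(seq: str) -> str:
    """Return the inverse (reverse) complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical 12-nt fragment, NOT taken from SEQ ID No. 2.
sense = "GGATCCATGAGT"
antisense = inverse_complement(sense)
print(antisense)

# Applying the operation twice restores the original strand.
assert inverse_complement(antisense) == sense
```

Because the operation is its own inverse, either sequence of such a pair can be regenerated from the other.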

DETAILED DESCRIPTION

[0023] The present invention relates to genetically engineered plants which are transformed with a DNA molecule 
encoding an untranslatable plus sense RNA molecule. 

Definition of Terms

[0024] Susceptible plant: A plant that supports viral replication and displays virus-induced symptoms.
[0025] Resistant plant: A plant wherein virus-induced symptoms are attenuated and virus replication is attenuated.
[0026] Plus sense RNA (and sense RNA): That form of an RNA which can serve as messenger RNA.
[0027] Minus sense RNA: That form of RNA used as a template for plus sense RNA production.
[0028] Antisense RNA: RNA complementary to plus sense RNA form.
[0029] R0 generation: Primary transformants.
[0030] R1 generation: Progeny of primary transformants.
[0031] R2 generation: Second generation progeny of R0 generation (i.e., progeny of R1 generation).
[0032] A gene derived in part from a plant virus RNA molecule: At least the portion of the gene encoding the
untranslatable RNA molecule is derived from a plant virus RNA molecule.

GENERAL DESCRIPTION 
[0033] The untranslatable plus sense RNA molecule is encoded by a gene located on the DNA molecule. The gene
comprises DNA derived from a plant virus RNA genome and also DNA from heterologous sources. The DNA from
heterologous sources includes elements controlling the expression of the virus-derived sequences [...]

[0034] [...] portion of the gene which comprises DNA from a plant virus has been derived from a
potyvirus. Plants transformed with the DNA molecule containing the gene are less susceptible to infection by
potyviruses. Most specifically, the DNA from the potyvirus source has been derived from the coat protein gene of Tobacco
Etch Virus and transformed plants are resistant to infection by Tobacco Etch Virus. Plants which can be made resistant
to potyvirus infection include, but are not limited to, tobacco.

[0035] Accordingly, the present invention provides a method for genetically engineering plants by insertion, into the
plant genome, of a DNA construct containing a recombinant gene derived from a potyvirus genome such that the
engineered plants display resistance to the potyvirus.

[0036] In accordance with one aspect of the present invention, genetically transformed plants which are resistant to
infection by a plant potyvirus are produced by inserting into the genome of the plant a DNA sequence which causes
the production of an untranslatable coat protein RNA of the potyvirus.

[0037] In accordance with another aspect of the present invention, a DNA sequence is provided to function in plant
cells to cause the production of an untranslatable plus sense RNA molecule. There has also been provided, in
accordance with yet another aspect of the present invention, bacterial and transformed plant cells that contain the
above-described DNA. In accordance with yet another aspect of the present invention, a differentiated tobacco plant has been
provided that comprises transformed tobacco cells which express the untranslatable coat protein RNA of Tobacco Etch
Virus and which plants exhibit resistance to infection by Tobacco Etch Virus.

[0038] A mechanism by which an untranslatable plus sense RNA molecule, such as described in the current invention,
can function to inhibit the normal biological function of a minus sense RNA molecule is proposed. One skilled in the
art will recognize that the novel approach described herein is not limited to the specific experimental example given
and will appreciate the wider potential utility of the invention.

[0039] The expression of a plant gene which exists in double-stranded DNA form involves transcription of messenger
RNA (mRNA) from one strand of the DNA by RNA polymerase enzyme, and the subsequent processing of the mRNA
primary transcript inside the nucleus. This processing involves a 3' nontranslated region which causes polyadenylate
nucleotides to be added to the 3' end of the viral RNA. Transcription of DNA into mRNA is regulated by a region of
DNA usually referred to as the "promoter." The promoter region contains a sequence of bases that signals RNA
polymerase to associate with the DNA and to initiate the transcription of mRNA using one of the DNA strands as a template
to make a corresponding strand of RNA.

[0040] A number of promoters which are active in plant cells have been described in the literature. Promoters which
are known or are found to cause transcription of viral RNA in plant cells can be used in the present invention. Such
promoters may be obtained from plants or viruses and include, but are not limited to, the CaMV 35S promoter. As
described below, it is preferred that the particular promoter selected should be capable of causing sufficient expression
to result in the production of an effective amount of untranslatable plus sense RNA to render the plant substantially
resistant to virus infection. The amount of untranslatable plus sense RNA needed to induce resistance may vary with
the plant type. Accordingly, while the 35S promoter is preferred, it should be understood that this promoter may not be
the optimal one for all embodiments of the present invention. Furthermore, the promoters used in the DNA constructs
of the invention may be modified, if desired, to affect their control characteristics. DNA sequences have been identified
which confer regulatory specificity on promoter regions. For example, the small subunit of the ribulose bis-phosphate
carboxylase (ss RUBISCO) gene is expressed in plant leaves but not in root tissues. A sequence motif that represses
the expression of the ss RUBISCO gene in the absence of light, to create a promoter which is active in leaves but not
in root tissue, has been identified. This and/or other regulatory sequence motifs may be ligated to promoters such as
the CaMV 35S promoter to modify the expression patterns of a gene. Chimeric promoters so constructed may be used
as described herein. For purposes of this description, the phrase "CaMV 35S promoter" will therefore include all
promoters derived by means of ligation with operator regions, random or controlled mutagenesis, as well as tandem or
multiple copies of enhancer elements, and the like.

[0041] The 3' nontranslated region of genes which are known or are found to function as polyadenylation sites for
viral RNA in plant cells can be used in the present invention. Such 3' nontranslated regions include, but are not limited
to, the 3' transcribed, nontranslated region of the CaMV 35S gene and the 3' transcribed, nontranslated regions
containing the polyadenylation signals of the tumor-inducing (Ti) genes of Agrobacterium, such as the tumor morphology
large (tml) gene. For purposes of this description, the phrase "CaMV 35S 3' nontranslated region" will therefore include
all such appropriate 3' nontranslated regions.

[0042] The DNA constructs of the disclosed embodiment contain, in double-stranded DNA form, a portion of a cDNA
version of the single-stranded RNA genome of TEV. In potyviruses, including TEV, the viral genome [...]
encoding the coat protein [...] molecule and inhibit viral replication. Those skilled in the art will
recognize that other portions of a potyvirus genome could be substituted for the coat protein gene. Furthermore, it will
be apparent that suitable genomic portions are not limited to complete gene sequences.

[0043] A disclosed embodiment of the invention utilizes a double-stranded complementary DNA (cDNA) derived from
the region of the TEV genome encoding the coat protein gene. To the 5' end of this cDNA is ligated the CaMV 35S
promoter and CaMV 35S RNA 5' nontranslated region. To the 3' end is ligated the CaMV 35S 3' nontranslated region.
These 5' and 3' sequences are present to cause transcription of the gene in plant cells by the cellular enzyme RNA
polymerase to produce an RNA molecule of sequence corresponding to the sequence of the coat protein cDNA
sequence. Ordinarily, such an RNA would then be translated by ribosomes which would synthesize a protein of amino
acid sequence specified by the nucleotide sequence of the RNA molecule. Particular amino acids are specified by
nucleotide triplets termed codons. Codons which stipulate translation initiation and termination are also present in DNA
and RNA sequences. The current invention relates to RNA molecules which are untranslatable by ribosomes. In the
preferred embodiment the sequence of the TEV cDNA encoding the coat protein is mutated by a standard in vitro
mutagenesis technique to produce a frameshift mutation early in the coat protein structural gene immediately followed
by three translation termination signal codons. These mutations do not affect the ability of RNA polymerase to transcribe
an RNA molecule from the cDNA but prevent translation of the transcribed RNA by ribosomes. Those skilled in the art
will recognize that for the disclosed gene and for other genes, DNA sequences can be altered in other ways to cause
the DNA to encode an untranslatable plus sense RNA molecule. Thus the disclosed invention is not limited to the
mutations disclosed.
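
The effect of a frameshift followed by three termination codons can be sketched on a toy open reading frame. Everything below is an illustrative assumption: the 24-nt sequence is hypothetical (it is not the TEV coat protein sequence), the helper names are invented for this sketch, and the ORF is assumed to begin at the first nucleotide. Only the standard genetic code table is factual.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cdna: str) -> str:
    """Translate from the first ATG to the first stop codon ('' if no ATG)."""
    start = cdna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(cdna) - 2, 3):
        amino_acid = CODON_TABLE[cdna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def frameshift_with_stops(cdna: str, after: int, base: str = "C",
                          stops: str = "TAATAGTGA") -> str:
    """Insert one base after `after` nucleotides, then place three stop codons
    at the next codon boundary (the ORF is assumed to start at index 0)."""
    shifted = cdna[:after] + base + cdna[after:]
    boundary = -(-(after + 1) // 3) * 3  # round up to a multiple of 3
    return shifted[:boundary] + stops + shifted[boundary:]


# Hypothetical toy ORF, not the TEV coat protein sequence.
toy_orf = "ATGGCTAGTGGCACTGTTGATTGA"
toy_mutant = frameshift_with_stops(toy_orf, after=6)

print(translate(toy_orf))     # full-length toy peptide
print(translate(toy_mutant))  # translation stops right after the frameshift
```

The mutant is ten nucleotides longer and would still be transcribed in full, but a ribosome reading from the ATG reaches a stop codon after three codons, and the downstream sequence is out of frame.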

[0044] A disclosed embodiment utilizes a cDNA encoding the coat protein gene of TEV, mutated so as to encode an
untranslatable plus sense RNA. It will be obvious to one skilled in the art that further sequence alteration of the cDNA
molecule could be used to confer additional features on the untranslatable plus sense RNA molecule. Additional
features include those which would result in increased viral resistance of plants transformed with the cDNA molecule
encoding an untranslatable plus sense RNA. The inclusion of a ribozyme sequence which causes the RNA catalyzed
destruction of the target RNA molecule would constitute one such additional feature. Suitable ribozyme sequences are
known, as discussed in Tabler and Tsagris (1991).

[0045] A DNA construct in accordance with the present invention is introduced, via a suitable vector and
transformation method as described below, into plant cells and plants transformed with the introduced DNA are regenerated.
Various methods exist for transforming plant cells and thereby generating transgenic plants. Methods which are known
or are found to be suitable for creating stably transformed plants can be used in this invention. The choice of method
will vary with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods
for given plant types. Suitable methods may include, but are not limited to: electroporation of plant protoplasts; liposome
mediated transformation; polyethylene glycol mediated transformation; transformation using viruses; microinjection of plant
cells; microprojectile bombardment of plant cells and Agrobacterium tumefaciens (AT) mediated transformation. The
latter technique is the method of choice for the disclosed preferred embodiment of the present invention.

[0046] In an embodiment of the current invention, the DNA sequences comprising the CaMV 35S promoter and
CaMV 35S nontranslated 3' region and the mutated cDNA encoding an untranslatable plus sense RNA derived from
the TEV coat protein gene are combined in a single cloning vector. This vector is subsequently transformed into AT
cells and the resultant cells are used to transform cultured tobacco cells.

[0047] Vectors suitable for the AT mediated transformation of plants with the DNA of the invention are disclosed. It
will be obvious to one skilled in the art that a range of suitable vectors is available, including those disclosed by Bevan
(1983), Herrera-Estrella (1983), Klee (1985) and EP-A-120516 (Schilperoort et al.). Suitable vectors are available on
a commercial basis from Clontech (Palo Alto, CA) and Pharmacia LKB (Pleasant Hill, CA) and other sources.

[0048] Following the transformation of plant cells and regeneration of transformed plants with the DNA molecules
as described, regenerated plants are tested for increased virus resistance. Plants are preferably exposed to the virus
at a concentration within a range where the rate of disease development correlates linearly with virus concentration.
Methods for virus inoculation are well known to those skilled in the art and are reviewed by Kado and Agrawal (1972).
One such method includes abrading a leaf surface with an aqueous suspension containing an abrasive material such
as carborundum and virus, or dusting leaves with such an abrasive material and subsequently applying the virus onto
the leaf surface. A virus suspension can be directly inoculated into leaf veins or alternatively plants can be inoculated
using insect vectors. The virus suspension may comprise purified virus particles, or alternatively, sap from virus infected
plants may be utilized.

[0049] Transformed plants are then assessed for resistance to the virus. The assessment of resistance or reduced
susceptibility may be manifest in different ways dependent on the particular virus type and plant type. Those skilled in
the art will realize that a comparison of symptom development on a number of inoculated untransformed plants with
symptom development on similarly inoculated transformed plants will provide a preferred method of determining the
effects of transformation with the specified DNA molecule on plant resistance. Symptoms of infection include, but are
not limited to, leaf mottling, chlorosis and etching. Plants showing increased viral resistance may be recognized by
delay in appearance of such symptoms or attenuation or total lack of such symptoms.
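
One way to tabulate such a comparison is as the cumulative percentage of symptomatic plants per day after inoculation. The sketch below is illustrative only: the function names are invented here and the observation lists are hypothetical, not data from the experiments described.

```python
def percent_symptomatic(first_symptom_days, n_plants, last_day):
    """Cumulative % of plants showing systemic symptoms on each day 1..last_day.

    first_symptom_days holds one entry per plant that became symptomatic;
    plants that never showed symptoms are simply absent from the list.
    """
    return [
        100.0 * sum(1 for d in first_symptom_days if d <= day) / n_plants
        for day in range(1, last_day + 1)
    ]


def days_to_half(curve):
    """First day on which at least 50% of plants are symptomatic, else None."""
    for day, pct in enumerate(curve, start=1):
        if pct >= 50.0:
            return day
    return None


# Hypothetical observations for ten plants per group (illustrative only).
control_days = [4, 4, 5, 5, 5, 5, 6, 6, 6, 7]
transformed_days = [7, 8, 9]

control_curve = percent_symptomatic(control_days, n_plants=10, last_day=10)
transformed_curve = percent_symptomatic(transformed_days, n_plants=10, last_day=10)
print(days_to_half(control_curve), days_to_half(transformed_curve))
```

A delay appears as a later day at which the curve reaches a given percentage, and a total lack of symptoms as a curve that stays at zero.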

Example

[0050] Work with tobacco plants and the Tobacco Etch Virus (TEV) is illustrative of the invention.

Construction of gene encoding untranslatable plus sense RNA molecule

[0051] The Highly Aphid Transmissible (HAT) isolate of Tobacco Etch Virus (TEV) was obtained from Dr. Tom Pirone
(University of Kentucky) and maintained in Nicotiana tabacum (Burley 21). The virus was purified from Nicotiana
tabacum (Burley 21) 20 to 30 days following inoculation. Viral purification and RNA isolation procedures have been
described (Dougherty and Hiebert 1980a). Complementary DNA (cDNA) was synthesized, made double-stranded and
inserted into the bacterial plasmid pBR322 as described by Allison et al. (1985a, 1985b, 1986). cDNA synthesis was
accomplished as follows: Purified viral RNA primed with oligo(dT12-18) served as a template for single-strand cDNA
synthesis by reverse transcriptase. Following the addition of homopolymeric tracts of deoxycytidine 5' monophosphate,
second-strand synthesis, primed with oligo(dG12-18), was completed with DNA polymerase I. SalI and EcoRI linkers
were ligated to the double-stranded cDNA and inserted into the bacterial plasmid pBR322 (Kurtz and Nicodemus 1981).
The resulting cDNA clones were screened by colony hybridization (Hanahan and Meselson 1980) with oligo(dT12-18)
primed, 32P-labeled single-stranded TEV cDNA. Plasmid DNA was isolated from colonies which hybridized with the
probe, and the SalI/EcoRI cDNA inserts were sized by electrophoresis in a 0.8% (w/v) agarose gel using a horizontal
water-cooled gel apparatus.

[0052] The SalI/EcoRI inserts from the recombinant molecules were isolated from an agarose gel with NA45
membrane (Schleicher & Schuell, Keene, NH) according to the manufacturer's protocol. The following restriction enzymes
were used either alone or in combination to digest the isolated cDNA insert: HindIII, XhoI, AluI, HaeIII, RsaI, Sau3A,
and TaqI. Restriction enzyme digestion products were inserted into the DNA of an appropriate M13 bacteriophage
(Messing 1983) selected for the presence of corresponding polylinker restriction sites, and their nucleotide sequences
were determined by dideoxy chain termination.

[0053] Plasmid pTL 37/8595 (Carrington and Dougherty 1987; Carrington et al. 1987) contains a cDNA copy of the
genomic sequence of HAT TEV corresponding to nucleotides (nt) 1-200 and nt 8462-9495 (Fig. 2). (Numbering of the
TEV genome nucleotides is according to that presented in Allison et al. 1986.) The nucleotide sequence and deduced
amino acid sequence of the Tobacco Etch Virus genome and the numbering system utilized by Allison et al. (1986)
and herein are shown in Fig. 1 and SEQ ID No. 1 in the attached sequence listing. The first and last codons of the coat
protein (CP) coding region in the TEV genome are nt 8518-8520 (encoding the amino acid serine) and 9307-9309
(opal stop codon) respectively. pTL 37/8595 was subject to in vitro site-directed mutagenesis as described by Taylor
et al. (1985a, 1985b). In all cases, nucleotide changes were confirmed by dideoxy-nucleotide sequencing (Sanger et
al. 1977).
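
The coordinates quoted above fix the size of the CP coding region. The short calculation below uses only the nucleotide positions stated in the text; the variable and function names are illustrative.

```python
# Coordinates from the text (1-based, inclusive; numbering of Allison et al. 1986).
CP_FIRST_CODON = (8518, 8520)  # serine codon
CP_STOP_CODON = (9307, 9309)   # opal stop codon


def span_length(first: int, last: int) -> int:
    """Length of a 1-based inclusive coordinate range."""
    return last - first + 1


cp_nt = span_length(CP_FIRST_CODON[0], CP_STOP_CODON[1])  # includes the stop codon
cp_codons, remainder = divmod(cp_nt, 3)
cp_residues = cp_codons - 1  # the stop codon encodes no amino acid

print(cp_nt, cp_codons, remainder, cp_residues)
```

A remainder of zero confirms that the first and last codon positions given lie in the same reading frame.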

[0054] TEV nt 9312-9317 were first mutated (Fig. 2) to generate a BamHI restriction site (GGATCC). TEV nt
8516-8521 were then altered to generate an NcoI site (CCATGG), changing the first codon of the TEV CP coding
region from AGT (Ser) to ATG (Met). A single oligonucleotide was then used to mutate TEV nt 133-138 to a BamHI
restriction site (GGATCC), nt 143-148 to an NcoI restriction site (CCATGG) and nt 142 to a deoxyadenylate residue.
These mutations generated an NcoI site centered on the first codon of the TEV ORF and in a good translational start
context as described by Kozak (1984). Digestion of the resulting plasmid with the restriction enzyme NcoI, removal of
TEV nt 143-200/8462-8516, and religation generated plasmid pTC:FL. pTC:FL contained only the TEV CP gene
flanked by BamHI restriction sites and TEV 5' and 3' untranslated sequences (see Fig. 2). The nucleotide sequence
of the TEV CP gene in pTC:FL produced by this mutagenesis scheme is shown in SEQ ID No. 2 in the attached
sequence listing.
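
The placement of the NcoI sites relative to the codons can be checked from the stated coordinates. In this sketch the recognition sequences and nucleotide positions come from the text, while the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5'->3' on both DNA strands."""
    return site == site.translate(COMPLEMENT)[::-1]


def codon_in_site(site: str, site_first_nt: int, codon_first_nt: int) -> str:
    """Return the three bases of a site that cover a codon at genome coordinates."""
    offset = codon_first_nt - site_first_nt
    return site[offset:offset + 3]


BAMHI, NCOI = "GGATCC", "CCATGG"

# NcoI site placed at TEV nt 8516-8521; the first CP codon is nt 8518-8520.
cp_start_codon = codon_in_site(NCOI, 8516, 8518)

# NcoI site placed at TEV nt 143-148; its ATG begins at the site's third base.
orf_atg_first_nt = 143 + NCOI.index("ATG")
# Position -3 relative to the A of the ATG (the A is +1; there is no position 0).
kozak_minus3_nt = orf_atg_first_nt - 3

print(is_palindromic(BAMHI), is_palindromic(NCOI), cp_start_codon, kozak_minus3_nt)
```

The result shows why nt 142 was changed together with the site at nt 143-148: it is the position three nucleotides upstream of the ATG, where a purine favors initiation in the context described by Kozak.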

[0055] Plasmid pTC:RC (RNA Control, producing untranslatable plus sense RNA) was generated by insertion of a
single deoxythymidylate residue after TEV nt 8529, and point mutations of TEV nt 8522 (G to C), 8534 (C to A), 8542
(G to A), and 8543 (A to G) to create a frameshift mutation immediately followed by three stop codons. An NheI restriction
site (GCTAGC) was simultaneously generated, for screening purposes, at nt 8539-8544. The nucleotide sequence of
the TEV CP gene in pTC:RC produced by this mutagenesis scheme is shown in SEQ ID No. 3 in the attached sequence
listing.

[0056] All plasmids described above were linearized with HindIII, transcribed with T7 RNA polymerase (Melton et
al. 1984), and translated in a rabbit reticulocyte lysate containing methionine (Dougherty and Hiebert 1980a).
Radiolabeled translation products were analyzed by electrophoretic separation on a 12.5% acrylamide gel containing
SDS (Laemmli 1970) and detected by autoradiography. Transcripts of plasmid pTC:RC produced no detectable protein
products, while transcripts from pTC:FL produced proteins of the expected sizes.

[0057] The various forms of the CP nucleotide sequence were then inserted as BamHI cassettes into the plant
expression vector pPEV (see below and Fig. 3).

[0058] The full length TEV CP open reading frame of pTC:FL was inserted in the reverse orientation to make the
antisense (AS) construct pTC:AS. The nucleotide sequence of the TEV CP gene in pTC:AS is shown in SEQ ID No.
4 in the attached sequence listing.

Transformation Vector Construction

[0059] Construction of pPEV. The vector pPEV is part of a binary vector system for Agrobacterium tumefaciens
mediated plant cell transformation. Plasmid pPEV was constructed from the plasmids pCGN 2113 (Calgene), pCIB
710 and pCIB 200 (Ciba Geigy Corp.). pCGN 2113 contains the "enhanced" Cauliflower Mosaic Virus (CaMV) 35S
promoter (CaMV sequences -941 to 90/-363 to +2, relative to the transcription start site) in a pUC derived plasmid
backbone. pCIB 710 has been described (Rothstein et al. 1987) and pCIB 200 is a derivative of the wide host range
plasmid pTJS 75 (Schmidhauser and Helinski 1985) which contains left and right A. tumefaciens T37 DNA borders,
the plant selectable NOS/NPT II chimeric gene from the plasmid Bin 6 (Bevan 1984) and part of a pUC polylinker. The
small EcoRI-EcoRV DNA fragment of pCIB 710 (Rothstein et al. 1987) was ligated into EcoRI-EcoRV digested pCGN
2113. This regenerated the enhanced CaMV 35S promoter (Kay et al. 1987) of pCGN 2113 and introduced the CaMV
35S 5' and 3' untranslated sequences into pCGN 2113. The CaMV 35S promoter-terminator cassette of the resulting
plasmid was isolated as an EcoRI-XbaI DNA fragment and ligated into EcoRI-XbaI digested pCIB 200 to generate
pPEV. CP nucleotide sequences from pTC:FL, pTC:RC, and pTC:AS were cloned as BamHI cassettes into BamHI
digested pPEV and orientation of inserts confirmed by digestion with appropriate restriction endonucleases.
Transformation and Regeneration of Tobacco

[0060] pPEV plasmids containing TEV CP ORFs were mobilized from E. coli HB101 into A. tumefaciens A136
containing plasmid pCIB 542 (Ciba Geigy), using the helper plasmid pRK 2013 in E. coli HB101 and the tri-parental mating
system of Ditta et al. (1980). Plasmid pCIB 542 supplied vir functions necessary for T-DNA transfer.
[0061] Leaf discs of Nicotiana tabacum cv Burley 49 were transformed and whole plants regenerated according to
Horsch et al. (1985). Transformed tissue was selected by culturing callus on MS plates (Murashige and Skoog 1962)
containing 1 μg/ml 6-benzylaminopurine (Sigma Corp.), 0.1 μg/ml α-naphthaleneacetic acid (Sigma Corp.), 500 μg/ml
carbenicillin and 100 μg/ml kanamycin sulfate (Sigma Corp.). Shoots were rooted on MS plates containing 500 μg/ml
carbenicillin and 100 μg/ml kanamycin sulfate, and plantlets were transplanted into soil and transferred directly into
the greenhouse approximately 2-3 weeks after rooting.

[0062] R0, R1 and R2 generation plants were screened by western and/or northern blot analyses. R2 seed (ca. 100
seeds per R2 plant) was screened for the kanamycin-resistant phenotype (kan^r) by surface sterilizing seed in 10%
bleach for 5 min., washing twice in sterile water and germinating on MS plates containing 100 μg/ml kanamycin sulfate.
R2 seed lines which were 100% kanamycin resistant were screened by western blot analysis for expression of TEV
coat protein. Those transgenic plant lines generated and their nomenclature are presented in Fig. 3.
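The reasoning behind selecting seed lines that are 100% kanamycin resistant can be illustrated numerically. The sketch below assumes a single dominant transgene locus and a self-pollinated parent, neither of which is stated above; the seed counts are hypothetical, and both function names are illustrative.

```python
def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Chi-square statistic for a 3:1 resistant:sensitive segregation (1 d.f.)."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def prob_all_resistant(n_seeds: int) -> float:
    """Chance that a hemizygous parent yields only resistant seed among n_seeds."""
    return 0.75 ** n_seeds


# Hypothetical count: 74 resistant and 26 sensitive seedlings out of 100.
print(round(chi_square_3_to_1(74, 26), 4))  # 0.0533, well below 3.84 (p = 0.05)

# A fully resistant lot of 100 seeds is very unlikely from a hemizygous parent.
print(prob_all_resistant(100) < 1e-12)  # True
```

Under these assumptions, a seed lot of about 100 seeds that is entirely kanamycin resistant is most simply explained by a parent homozygous for the transgene.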

Molecular Analyses of Transgenic Plants 

[0063] Transgenic tobacco plants were analyzed by western and northern blot analyses to determine the nature of
protein and RNA products produced, respectively. Total RNA samples isolated from the various transgenic lines were
analyzed in northern blot hybridization studies. Total nucleic acids were isolated from tissue and RNA precipitated with
LiCl as described by Verwoerd et al. (1989). RNAs were electrophoretically separated on 1.2% agarose gels containing
6% (v/v) formaldehyde and transferred to nitrocellulose. Prehybridization and hybridization conditions were as
described in Sambrook et al. (1989). Strand specific riboprobes were generated from SP6 or T7 DNA dependent RNA
polymerase transcription reactions of pTL 37/8595 linearized with the restriction enzymes Asp718 (Boehringer
Mannheim, Indianapolis, IN) or HindII, respectively, using α-labelled 32P-CTP ribonucleotide and suggested procedures
(Promega, Madison, WI).

[0064] An RNA transcript of approximately 1,000 nt was expected with all transgenic plant lines. Such a TEV CP
transcript was detected in CP expressing plant lines by using a minus sense riboprobe containing the TEV CP sequence.
A similar transcript was detected in AS plants by using a plus sense riboprobe containing the TEV CP sequence. The
transcript in the RC line, while detected with a minus sense riboprobe, may have migrated as a slightly larger (ca.
1,100-1,200 nt) RNA species, possibly due to termination at an alternately selected site and/or a longer poly-A tail on
the transcript. Differing levels of CP transcript accumulation were observed among different transgenic plant lines.
Transgenic plant lines expressing the coat protein of TEV were identified by western blot analysis using polyclonal
antisera to TEV CP. Tissue samples of regenerated plants were ground in 10 volumes of 2X Laemmli (Tris-glycine)
running buffer (Laemmli 1970) and clarified by centrifugation in a microcentrifuge for 10 min. at 10,000xg. Protein
concentration was estimated by the dye binding procedure of Bradford (1976) using BSA as a standard. Protein samples
(50 μg total protein) were separated on a 12.5% polyacrylamide gel containing SDS and subjected to the immunoblot
transfer procedures described by Towbin et al. (1979). Anti-TEV coat protein polyclonal primary antibodies, alkaline
phosphatase conjugated secondary antibodies and the chromogenic substrates NBT (para-nitro blue tetrazolium
chloride) and BCIP (5-bromo-4-chloro-3-indolyl phosphate para-toluidine salt) were used to detect bound antigen.
[0065] Coat protein products produced in FL plants were stable and accumulated to different levels in individual
transgenic plant lines. It was estimated by western blot analysis that between 0.001% and 0.01% of total extracted protein
was TEV CP.
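For orientation, the estimated coat protein fraction can be converted into an approximate mass per gel lane, given that 50 μg of total protein was loaded. The following Python sketch performs only this arithmetic; the function name `cp_mass_ng` is illustrative and the calculation is not itself part of the described method.

```python
def cp_mass_ng(total_protein_ug: float, cp_percent: float) -> float:
    """Mass of coat protein (ng) in a sample, given its percentage of total protein."""
    total_protein_ng = total_protein_ug * 1000.0  # 1 ug = 1000 ng
    return total_protein_ng * cp_percent / 100.0


loaded_ug = 50.0  # total protein loaded per lane, as stated above
low = cp_mass_ng(loaded_ug, 0.001)
high = cp_mass_ng(loaded_ug, 0.01)
print(f"{low:.1f} to {high:.1f} ng of coat protein per lane")
```

On these figures each lane would contain roughly 0.5 to 5 ng of TEV coat protein.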

Assessment of Resistance to TEV 

[0066] Eight-week-old (circa 15 cm tall) R1 and R2 plants were inoculated with either purified virus preparations or
infected plant sap. Inoculum was applied with sterile, premoistened cotton swabs. Infected plant sap inoculum was
prepared by grinding TEV-infected N. tabacum Burley 21 leaf tissue (2 weeks postinoculation) in carborundum and 50
mM sodium phosphate buffer (pH 7.8) at a ratio of 1 g:0.2 g:10 ml, respectively, and filtering the homogenate through
cheesecloth. TEV virions were purified as described by Dougherty and Hiebert (1980b). One leaf per plant was dusted
lightly with carborundum (320 grit) and inoculated at two interveinal locations with 50 μl (total) of inoculum. Inoculated
plants were examined daily and the appearance and severity of systemic symptoms recorded. Symptoms on any leaf
above the inoculated leaf were considered to be systemic.

[0067] Typically, inoculation of Burley 49 plants with TEV (either purified virus or plant sap) resulted in severe chlorosis
and mosaic and mottle on systemically infected leaves approximately 6-7 days after inoculation. Severe etching of the
leaf followed within a few days. It was observed that transgenic plants containing only the CaMV promoter and
untranslated sequences (i.e., 35S plant line) responded to challenge inoculation in a manner similar to wild type Burley
49, developing extensive chlorosis and etching at the same rate (Fig. 4A). Plant lines which expressed FL TEV CP
showed little or no delay in the appearance of symptoms when inoculated with infected plant sap. However, FL
transgenic plants did show a slight attenuation of symptoms and eventually (2-4 weeks after initial appearance of symptoms),
younger leaf tissue emerged devoid of symptoms and virus as demonstrated by back inoculation experiments. Typically
chlorosis and etching on older systemic leaves was limited.

[0068] Ten independently transformed RC lines and seven independently transformed AS lines were obtained.
Progeny from three of the RC lines, including line RC #5, and from one of the AS lines, AS #3, showed an altered
response to viral infection relative to control plants. All of these lines were verified to be transformed and were producing
expected RNA products. A possible explanation for the variation in observed phenotype is the previously noted "position
effect" whereby the expression of genes from identical DNA sequences integrated at different locations within the
genome show varying patterns of tissue specificity.

[0069] Ten R2 expressing plants of the FL expressing line were inoculated with infected plant sap, and 20 R1 plants
of lines AS #3 and RC #5 were inoculated with 50 μl of a 5 μg/ml solution of purified TEV. Identical results to those
obtained by purified TEV inoculation were obtained when AS #3 and RC #5 R1 plants were inoculated with TEV-infected
plant sap, as described above.
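The quantity of purified virus applied to each plant follows directly from the stated volume and concentration. The short Python sketch below shows the unit conversion; the function name `inoculum_dose_ug` is illustrative only.

```python
def inoculum_dose_ug(volume_ul: float, concentration_ug_per_ml: float) -> float:
    """Mass of virus (ug) delivered in a given inoculum volume."""
    volume_ml = volume_ul / 1000.0  # 1 ml = 1000 ul
    return volume_ml * concentration_ug_per_ml


dose = inoculum_dose_ug(volume_ul=50.0, concentration_ug_per_ml=5.0)
print(f"{dose:.2f} ug of purified TEV per plant")
```

With 50 μl of a 5 μg/ml preparation, each plant receives approximately 0.25 μg of purified TEV.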

[0070] Transgenic Burley 49 plant lines AS #3 and RC #5, expressing only TEV CP related RNA sequences, showed
a delay in the appearance of symptoms and a modification of symptoms when inoculated with TEV (Fig. 4B). Since
the 20 R1 plants were not screened for expression of CP RNA prior to inoculation, some of the symptomatic plants
represented non-expressing plants in which the gene of interest had been lost during Mendelian segregation. Modified
symptoms on AS #3 plants appeared as small chlorotic lesions often associated with a vein. Most of the leaves were
devoid of symptoms and virus (determined by back inoculation experiments). Approximately 15% of RC #5 plants
showed symptoms which were identical to those of infected Burley 49. However, the remaining RC #5 plants were
entirely asymptomatic, and virus was not detected in back inoculation studies.
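Whether a symptomatic fraction of about 15% among 20 R1 plants is compatible with simple segregation can be examined with a binomial calculation. The sketch below assumes, purely for illustration, a single transgene locus in a self-pollinated hemizygous parent, so that one quarter of R1 progeny would be expected to lack the transgene; this assumption, the rounding of 15% of 20 plants to 3 plants, and the function name are not taken from the data above.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Probability of observing k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_plants = 20           # R1 plants inoculated per line
n_symptomatic = 3       # about 15% of 20 plants (rounded, hypothetical)
p_null_segregant = 0.25 # assumed fraction lacking the transgene

probability = binomial_cdf(n_symptomatic, n_plants, p_null_segregant)
print(f"P(3 or fewer of 20) = {probability:.4f}")  # about 0.2252
```

A probability of this size indicates that the observed number of symptomatic plants is not unusual under the stated single-locus assumption, in keeping with the explanation that these plants had lost the gene through segregation.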

[0071] Plants from TEV resistant AS and RC lines showed no increased resistance, relative to untransformed
controls, to infection by two other members of the potyvirus family, namely Tobacco Vein Mottling Virus and Potato Virus Y.
[0072] Rg generation plants derived from TEV-resistant RC plants showed the expected Mendelian pattern of inher- 
itance of the TEV-resistant phenotype. 

Analysis of TEV Replication in Protoplasts Derived from Transgenic Plant Lines 

[0073] In an attempt to explain the results obtained when AS and RC transgenic plants were challenged with TEV,
it was sought to determine if all of the transgenic plant lines would support virus replication at a level comparable to
Burley 49. Accumulation of viral encoded proteins was used as an indirect indicator of viral replication. Protoplasts
were derived from leaf tissue of homozygous CP expressing plants and electroporated with TEV RNA according to the
procedure of Luciano et al. (1987). Protoplasts (1 X 1 0^) were resuspended in 450 μl electroporation buffer (330
mM mannitol, 1 mM KPO4 pH 7.0, 150 mM KCl) and electroporated using a BTX Transfector 300 (BTX, San Diego,
CA) (950 microfarads, 130-volt pulse amplitude, 3.5 mm electrode gap) in the presence or absence of 6 μg of purified
TEV RNA. After electroporation, protoplasts were incubated for 96 hours in incubation medium as described in Luciano
et al. (1987). Protoplasts were extracted in 2X Laemmli (Tris-glycine) running buffer, and 5 x 10* extracted protoplasts
were then subjected to western blot analysis as described above. Protoplast viability was measured by dye exclusion
as described in Luciano et al. (1987). All electroporated protoplast samples had equivalent viability counts. The results
indicated that protoplasts from all FL plant lines supported virus replication at levels comparable to wild type Burley
49 protoplasts. R1 transgenic plants from lines AS #3 and RC #5 were initially screened by northern analysis, and
leaves from positive expressors were used in the production of protoplasts. Transfected protoplasts derived from AS
#3 plants supported TEV replication, albeit at a reduced level. Protoplasts derived from RC #5 transgenic plant leaf
tissue did not support TEV replication at a detectable level. These results, and those presented in the whole plant
inoculation series, suggested that AS and RC plants interfere with TEV replication.

Discussion of Data 

[0074] The above example indicates that varying degrees of protection from TEV infection can be achieved by
overexpression of coat protein and by expression of an antisense RNA. The current invention, which comprises the
expression of an untranslatable plus sense RNA molecule, provides protection against TEV infection that is more effective
than either of these two methods. Plants of line RC #5, transformed with the disclosed DNA molecule encoding an
untranslatable plus sense RNA derived from the TEV coat protein gene, were asymptomatic and appear to be
completely protected from virus infection. The disclosed invention therefore represents a new and effective way of
generating potyvirus resistant germplasm.

[0075] Tobacco protoplasts derived from plants expressing the antisense RNA supported a reduced level of TEV
replication compared to control cells derived from untransformed plants. In contrast, tobacco protoplasts derived from
plants of line RC #5, expressing the untranslatable plus sense RNA, did not support detectable TEV replication. This
suggests that the untranslatable plus sense RNA was more effective at blocking TEV replication in the cells of those
transformed plants tested.

[0076] It is proposed that the untranslatable plus sense RNA inhibits viral replication by hybridizing to the minus
sense RNA replicative template of TEV. The finding that plants expressing untranslatable plus sense RNA derived from
the TEV coat protein gene are not protected from infection by Potato Virus Y or Tobacco Vein Mottling Virus is therefore
explained by the circa 40-50% amino acid sequence divergence between the coat proteins of these viruses and TEV
(Allison et al. 1986; Robaglia et al. 1989; Domier et al. 1986).
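Sequence divergence of the kind referred to above is ordinarily computed from an alignment. A minimal Python sketch follows, assuming the two sequences have already been aligned to equal length with "-" marking gaps; the example strings are hypothetical and are not potyvirus sequences, and `percent_identity` is an illustrative helper.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments for illustration only.
identity = percent_identity("ACGTACGT", "ACGTTCGA")
print(identity)          # 75.0
print(100.0 - identity)  # 25.0 (percent divergence)
```

The same function may be applied to aligned amino acid or nucleotide sequences; because hybridization depends on nucleotide complementarity, a nucleotide-level comparison would be the more direct measure for the mechanism proposed.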

[0077] From the above-described findings, it would be reasonable and entirely predictable that plants
transformed with a gene encoding an untranslatable plus sense RNA derived from a gene which was highly conserved
between viruses of the potyvirus family would be protected from infection by a wide range of viruses.
Regions of the potyvirus genome which are sufficiently conserved between potyvirus types to be potentially useful in
such an approach may be readily determined by one skilled in the art. Highly conserved regions may be determined
by reference to published sequence data (Allison et al. 1986; Robaglia et al. 1989; Domier et al. 1986; Lain et al. 1989;
Maiss et al. 1989). The utility of the identified regions could be readily determined using the methodologies described
above and substituting the defined region for the TEV coat protein gene.
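One simple way of locating conserved regions in published sequence data is a sliding-window scan over a multiple alignment. The Python sketch below is an illustration under stated assumptions: the sequences are pre-aligned to equal length, a column counts as conserved only if every sequence carries the same non-gap character, and the window size, threshold, function name and toy alignment are all hypothetical.

```python
from typing import List


def conserved_windows(alignment: List[str], window: int,
                      min_fraction: float) -> List[int]:
    """Return start positions of windows whose conserved-column fraction
    is at least min_fraction."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    # A column is conserved if all sequences share one non-gap character.
    conserved = [
        len({seq[i].upper() for seq in alignment}) == 1 and alignment[0][i] != "-"
        for i in range(length)
    ]
    starts = []
    for start in range(length - window + 1):
        fraction = sum(conserved[start:start + window]) / window
        if fraction >= min_fraction:
            starts.append(start)
    return starts


# Toy alignment; columns 3 and 5 differ between sequences.
toy = ["ATGGCACTGA",
       "ATGGCTCTGA",
       "ATGACACTGA"]
print(conserved_windows(toy, window=4, min_fraction=1.0))  # [6]
```

In practice the window length would be chosen to match the size of the region intended for expression, and a threshold below 1.0 would tolerate occasional mismatches.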

[0078] Regions of the potyvirus genome potentially suitable include, but are not limited to, the genes encoding the
viral replicase and the viral proteinase. Furthermore, it will be apparent to one skilled in the art that highly conserved
portions of a particular gene may also serve in this role.

[0079] It will also be apparent to one skilled in the art that the described invention may also be used to produce plants 
resistant to viruses outside of the potyvirus family in instances where these viruses also produce a minus sense RNA 
replicative template. 
SEQUENCE LISTING

[0119] 

(1) GENERAL INFORMATION: 

(i) APPLICANT: William G. Dougherty and John A. Lindbo

(ii) TITLE OF INVENTION: Production of Plants Showing Immunity to Viral Infection via Introduction of Genes
Encoding Untranslatable Plus Sense RNA Molecules 

(iii) NUMBER OF SEQUENCES: 4 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Richard J. Polley 

(B) STREET: One World Trade Center 121 S.W. Salmon Street, Suite 1600 

(C) CITY: Portland 

(D) STATE: Oregon 

(E) COUNTRY: United States of America 

(F) ZIP: 97204 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 5.25 inch 

(B) COMPUTER: IBM PC Compatible 

(C) OPERATING SYSTEM: MS DOS 

(D) SOFTWARE: WordPerfect 5.1 
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 07/838,509 

(B) FILING DATE: February 19, 1992 

(C) CLASSIFICATION: 435 
(vi) PRIOR APPLICATION DATA: None
(vii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Richard J. Polley, Esq. 

(B) REGISTRATION NUMBER: 28,107

(C) REFERENCE/DOCKET NUMBER: 245-35829/RJP 
(viii) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (503) 226-7391 

(B) TELEFAX: (503) 228-9446 
(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9495 

(B) TYPE: Nucleic Acid

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE:

(A) DESCRIPTION: cDNA to genomic RNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(v) FRAGMENT TYPE: N/A

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Tobacco Etch Virus (TEV) 

(B) STRAIN: Highly Aphid Transmitted (HAT) 

(vii) IMMEDIATE SOURCE: TEV propagated in N. tabacum Burley 49
(viii) POSITION IN GENOME: N/A
(ix) FEATURE: 

(A) NAME/KEY: Coat protein gene 

(B) LOCATION: Genomic nucleotides 8518-9306 

(C) IDENTIFICATION METHOD: - 

(D) OTHER INFORMATION: SEQ. ID No. 1 is the cDNA corresponding to the Tobacco Etch Virus Genome. 
(x) PUBLICATION INFORMATION:
(A) AUTHORS: Allison et al.

(B) TITLE: The nucleotide sequence of the coding region of Tobacco Etch Virus Genomic RNA: Evidence
for the Synthesis of a Single Polyprotein

(C) JOURNAL: Virology 

(D) VOLUME: 154 
(E) ISSUE: "

(F) PAGES: 9-20 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



NAAATAACAA ATCTCAACAC AACATATACA AAACAAACGA ATCTCAAGCA ATCAAGCATT 60 

CTACTTCTAT TGCAGCAATT TAAATCATTT CTTTTAAAGC AAAAGCAATT TTCTGAAAAT 120 

TTTCACCATT TACGAACGAT AGCA ATG GCA CTG ATC TTT GGC ACA GTC AAC GCT 174 

Met Ala Leu Ile Phe Gly Thr Val Asn Ala
1 5 10

AAC ATC CTG AAG GAA GTG TTC GGT GGA GCT CGT ATG GCT TGC GTT ACC 222 
Asn Ile Leu Lys Glu Val Phe Gly Gly Ala Arg Met Ala Cys Val Thr
15 20 25 



ACC GCA CAT ATG GCT CGA GCC AAT GCA ACC ATT TTG AAG AAC GCA GAA
Ser Ala His Met Ala Gly Ala Asn Gly Ser Ile Leu Lys Lys Ala Glu
30 35 40 

GAG ACC TCT CGT GCA ATC ATG CAC AAA CCA GTG ATC TTC GGA GAA CAC
Glu Thr Ser Arg Ala Ile Met His Lys Pro Val Ile Phe Gly Glu Asp
45 50 55 

TAC ATT ACC GAG CCA GAC TTC CCT TAG ACA CCA CTC CAT TTA GAG GTC 
Tyr Ile Thr Glu Ala Asp Leu Pro Tyr Thr Pro Leu His Leu Glu Val
60 65 70 

GAT GCT GAA ATG GAG CGG ATG TAT TAT CTT GGT CGT CGC GCG CTC ACC 
Asp Ala Glu Met Glu Arg Met Tyr Tyr Leu Gly Arg Arg Ala Leu Thr 
75 80 85 90

CAT GGC AAG AGA CGC AAA GTT TCT GTG AAT AAC AAG AGG AAC AGG AGA 
His Gly Lys Arg Arg Lys Val Ser Val Asn Asn Lys Arg Asn Arg Arg
95 100 105 

AGG AAA GTG GCC AAA ACG TAC GTG GGG CGT GAT TCC ATT GTT GAG AAG 
Arg Lys Val Ala Lys Thr Tyr Val Gly Arg Asp Ser Ile Val Glu Lys
110 115 120 

ATT GTA GTG CCC CAC ACC GAG AGA AAG GTT GAT ACC ACA GCA GCA GTG 
Ile Val Val Pro His Thr Glu Arg Lys Val Asp Thr Thr Ala Ala Val
125 130 135 

GAA GAC ATT TGC AAT GAA GCT ACC ACT CAA CTT GTG CAT AAT ACT ATG 
Glu Asp Ile Cys Asn Glu Ala Thr Thr Gln Leu Val His Asn Ser Met
140 145 150

CCA AAG CGT AAG AAG CAG AAA AAC TTC TTG CCC GCC ACT TCA CTA ACT 
Pro Lys Arg Lys Lys Gln Lys Asn Phe Leu Pro Ala Thr Ser Leu Ser
155 160 165 170 

AAC GTG TAT GCC CAA ACT TGG AGC ATA GTG CGC AAA CGC CAT ATG CAG 
Asn Val Tyr Ala Gln Thr Trp Ser Ile Val Arg Lys Arg His Met Gln
175 180 185 

GTG GAG ATC ATT AGC AAG AAG AGC GTC CGA GCG AGG GTC AAG AGA TTT 
Val Glu Ile Ile Ser Lys Lys Ser Val Arg Ala Arg Val Lys Arg Phe
190 195 200 

GAG GGC TCG GTG CAA TTC TTC GCA AGT GTG CGT CAC ATG TAT GGC GAC 
Glu Gly Ser Val Gln Leu Phe Ala Ser Val Arg His Met Tyr Gly Glu
205 210 215 

AGG AAA AGG GTG GAC TTA CGT ATT GAC AAC TGG CAG CAA GAG ACA CTT 
Arg Lys Arg Val Asp Leu Arg Ile Asp Asn Trp Gln Gln Glu Thr Leu
220 225 230 

CTA GAC CTT CCT AAA AGA TTT AAG AAT GAG AGA GTG GAT CAA TCG AAG 
Leu Asp Leu Ala Lys Arg Phe Lys Asn Glu Arg Val Asp Gln Ser Lys
235 240 245 250 

CTC ACT TTT GGT TCA AGT GGC CTA GTT TTG AGG CAA GGC TCG TAC GGA 
Leu Thr Phe Gly Ser Ser Gly Leu Val Leu Arg Gln Gly Ser Tyr Gly
255 260 265 

CCT GCG CAT TGG TAT CGA CAT GGT ATG TTC ATT GTA CGC GCT CGG TCG 
Pro Ala His Trp Tyr Arg His Gly Met Phe Ile Val Arg Gly Arg Ser
270 275 280 

GAT GGG ATG TTG GTG GAT GCT CGT GCG AAG GTA ACG TTC GCT CTT TGT 
Asp Gly Met Leu Val Asp Ala Arg Ala Lys Val Thr Phe Ala Val Cys 
285 290 295 
CAC TCA ATC ACA CAT TAT ACC CAC AAA TCA ATC TCT GAG GCA TTC TTC 1086
His Ser Met Thr His Tyr Ser Asp Lys Ser Ile Ser Glu Ala Phe Phe
300 305 310 

ATA CCA TAG TCT AAG AAA TTC TTG GAG TTG AGA CCA GAT GGA ATC TCC 1134
Ile Pro Tyr Ser Lys Lys Phe Leu Glu Leu Arg Pro Asp Gly Ile Ser
320 325 330 

CAT GAG TGT ACA ACA GGA GTA TCA GTT GAG CGG TGC GGT GAG GTG GCT 1182 
His Glu Cys Thr Arg Gly Val Ser Val Glu Arg Cys Gly Glu Val Ala
335 340 345 

CCA ATC CTG ACA CAA GCA CTT TCA CCG TGT GGT AAG ATC ACA TGC AAA 1230 
Ala Ile Leu Thr Gln Ala Leu Ser Pro Cys Gly Lys Ile Thr Cys Lys
350 355 360 

CCT TCC ATG GTT GAA ACA CCT CAC ATT GTT GAG GGT GAG TCG GGA GAA 1278
Arg Cys Met Val Glu Thr Pro Asp Ile Val Glu Gly Glu Ser Gly Glu
365 370 375 

AGT GTC ACC AAC CAA GGT AAG CTC CTA CCA ATG CTG AAA GAA CAG TAT 1326 
Ser Val Thr Asn Gln Gly Lys Leu Leu Ala Met Leu Lys Glu Gln Tyr
380 385 390

CCA GAT TTC CCA ATG GCC GAG AAA CTA CTC ACA AGG TTT TTG CAA CAG 1374 
Pro Asp Phe Pro Met Ala Glu Lys Leu Leu Thr Arg Phe Leu Gln Gln
395 400 405 410 

AAT ACA AAT TTG ACA GCC TCC GTG AGC GTC AAA CAA 1422
Lys Ser Leu Val Asn Thr Asn Leu Thr Ala Cys Val Ser Val Lys Gln
415 420 425 

CTC ATT GGT GAC CGC AAA CAA GCT CCA TTC ACA CAC GTA CTG GCT GTC 1470 
Leu Ile Gly Asp Arg Lys Gln Ala Pro Phe Thr His Val Leu Ala Val
430 435 440

AGC GAA ATT CTG TTT AAA GGC AAT AAA CTA ACA GGG GCT GAT CTC GAA 1518 
Ser Glu Ile Leu Phe Lys Gly Asn Lys Leu Thr Gly Ala Asp Leu Glu
445 450 455 

GAG GCA AGC ACA CAT ATG CTT GAA ATA GCA AGG TTC TTG AAC AAT CGC 1566 
Ala Ser Thr His Met Leu Glu Ile Ala Arg Phe Leu Asn Asn Arg
460 465 470 

ACT GAA AAT ATG CGC ATT GGC CAC CTT GGT TCT TTC AGA AAT AAA ATC 1614 

Thr Glu Asn Met Arg Ile Gly His Leu Gly Ser Phe Arg Asn Lys Ile

475 480 485 490 

TCA TCG AAG GCC CAT GTG AAT AAC GCA CTC ATG TGT GAT AAT CAA CTT 1662 

Ser Ser Lys Ala His Val Asn Asn Ala Leu Met Cys Asp Asn Gln Leu
495 500 505 

GAT CAC AAT GGG AAT TTT ATT TGG GCA CTA AGG GCT CCA CAC GCA AAG 1710 
Asp Gln Asn Gly Asn Phe Ile Trp Gly Leu Arg Gly Ala His Ala Lys
510 515 520 

AGG TTT CTT AAA GGA TTT TTC ACT GAG ATT GAC CCA AAT GAA GGA TAC 1758
Arg Phe Leu Lys Gly Phe Phe Thr Glu Ile Asp Pro Asn Glu Gly Tyr
525 530 535 

GAT AAC TAT CTT ATC AGC AAA CAT ATC AGG GGT AGC AGA AAG CTA GCA 1806 
Asp Lys Tyr Val Ile Arg Lys His Ile Arg Gly Ser Arg Lys Leu Ala
540 545 550 

ATT GGC AAT TTG ATA ATG TCA ACT GAC TTC CAG ACG CTC AGG CAA CAA 1854 
Asn Leu Ile Met Ser Thr Asp Phe Gln Thr Leu Arg Gln Gln

555 560 565 570 
ATT CAA GGC GAA ACT ATT GAG CGT AAA GAA ATT GGC AAT CAC TGC ATT
Ile Gln Gly Glu Thr Ile Glu Arg Lys Glu Ile Gly Asn His Cys Ile
575 580 585

TCA ATG CGG AAT GGT AAT TAG GTG TAG CCA TGT TGT TGT GTT ACT CTT 

Ser Met Arg Asn Gly Asn Tyr Val Tyr Pro Cys Cys Cys Val Thr Leu
590 595 600 

GAA GAT GGT AAG GCT CAA TAT TCG GAT CTA AAG CAC CCA ACG AAC AGA 
Glu Asp Gly Lys Ala Gln Tyr Ser Asp Leu Lys His Pro Thr Lys Arg
605 610 615 

CAT CTG GTC ATT GGC AAC TCT GGC GAT TCA AAG TAC CTA GAC CTT CCA 

His Leu Val Ile Gly Asn Ser Gly Asp Ser Lys Tyr Leu Asp Leu Pro
620 625 630 

GTT CTC AAT GAA GAG AAA ATG TAT ATA GCT AAT GAA GGT TAT TGC TAC 
Val Leu Asn Glu Glu Lys Met Tyr Ile Ala Asn Glu Gly Tyr Cys Tyr
635 640 645 650 

ATG AAC ATT TTC TTT GCT CTA CTA GTG AAT GTC AAG GAA GAG GAT GCA 
Met Asn Ile Phe Phe Ala Leu Leu Val Asn Val Lys Glu Glu Asp Ala
655 660 665 

AAG GAC TTC ACC AAG TTT ATA AGC GAC ACA ATT GTT CCA AAG CTT GGA
Lys Asp Phe Thr Lys Phe Ile Arg Asp Thr Ile Val Pro Lys Leu Gly
670 675 680

GCG TGG CCA ACA ATG CAA GAT GTT GCA ACT GCA TGC TAC TTA CTT TCC 
Ala Trp Pro Thr Met Gln Asp Val Ala Thr Ala Cys Tyr Leu Leu Ser
685 690 695

ATT CTT TAC CCA CAT GTC CTG ACA GCT GAA CTA CCC AGA ATT TTG GTT 
Ile Leu Tyr Pro Asp Val Leu Arg Ala Glu Leu Pro Arg Ile Leu Val
700 705 710 

GAT CAT GAC AAC AAA ACA ATG CAT GTT TTG GAT TCG TAT GGC TCT AGA 
Asp His Asp Asn Lys Thr Met His Val Leu Asp Ser Tyr Gly Ser Arg
715 720 725 730 

ACG ACA GGA TAC CAC ATG TTG AAA ATG AAC ACA ACA TCC CAG CTA ATT 
Thr Thr Gly Tyr His Met Leu Lys Met Asn Thr Thr Ser Gln Leu Ile
735 740 745 

GAA TTC GTT CAT TCA GGT TTG GAA TCC GAA ATG AAA ACT TAC AAT GTT 
Glu Phe Val His Ser Gly Leu Glu Ser Glu Met Lys Thr Tyr Asn Val 
750 755 760 

GGA GCG ATG AAC CGA GAT GTG GTC ACA CAA GGT GCA ATT GAG ATG TTG 
Gly Gly Met Asn Arg Asp Val Val Thr Gln Gly Ala Ile Glu Met Leu
765 770 775 

ATC AAG TCT ATA TAC AAA CCA CAT CTC ATG AAG CAG TTA CTT GAG GAA 
Ile Lys Ser Ile Tyr Lys Pro His Leu Met Lys Gln Leu Leu Glu Glu
780 785 790 

GAG CCA TAC ATA ATT GTC CTG GCA ATA GTC TCC CCT TCA ATT TTA ATT 
Glu Pro Tyr Ile Ile Val Leu Ala Ile Val Ser Pro Ser Ile Leu Ile
795 800 805 810 

GCC ATG TAC AAC TCT GGA ACT TTT GAC CAG GCG TTA CAA ATG TGG TTG 
Ala Met Tyr Asn Ser Gly Thr Phe Glu Gln Ala Leu Gln Met Trp Leu
815 820 825

CCA AAT ACA ATG AGG TTA GCT AAC CTC GCT CCC ATC TTG TCA GCC TTA 
Pro Asn Thr Met Arg Leu Ala Asn Leu Ala Ala Ile Leu Ser Ala Leu
830 835 840 
GCG CAA AAG TTA ACT TTC GCA GAT TTG TTC GTC CAG CAG CGT AAT TTG 2718
Ala Gln Lys Leu Thr Leu Ala Asp Leu Phe Val Gln Gln Arg Asn Leu
845 850 855

ATT AAT GAG TAT GCG CAG GTA ATT TTG GAC AAT CTG ATT GAC GCT GTC 2766
Ile Asn Glu Tyr Ala Gln Val Ile Leu Asp Asn Leu Ile Asp Gly Val
860 865 870

AGG CTT AAT CAT TCG CTA TCC CTA GCA ATG GAA ATT GTT ACT ATT AAG 2814 
Arg Val Asn His Ser Leu Ser Leu Ala Met Glu Ile Val Thr Ile Lys
875 880 885 890

CTG GCC ACC CAA GAG ATG GAC ATG GCG TTG AGG GAA GGT GGC TAT GCT 2862
Leu Ala Thr Gln Glu Met Asp Met Ala Leu Arg Glu Gly Gly Tyr Ala
895 900 905 

ACC TCT GAA AAG GTG CAT GAA ATG TTG GAA AAA AAC TAT GTA AAG 2910
Val Thr Ser Glu Lys Val His Glu Met Leu Glu Lys Asn Tyr Val Lys
910 915 920 

GCT TTG AAG GAT GCA TGG GAC GAA TTA ACT TGG TTG GAA AAA TTC TCC 2958 
Ala Leu Lys Asp Ala Trp Asp Glu Leu Thr Trp Leu Glu Lys Phe Ser 
925 930 935

GCA ATC AGG CAT TCA AGA AAG CTC TTG AAA TTT GGG CGA AAG CCT TTA 3006 
Ala Ile Arg His Ser Arg Lys Leu Leu Lys Phe Gly Arg Lys Pro Leu
940 945 

ATC ATG AAA AAC ACC GTA GAT TCC GGC GGA CAT ATA GAC TTG TCT CTG 3054
Ile Met Lys Asn Thr Val Asp Cys Gly Gly His Ile Asp Leu Ser Val
955 960 965 970 

AAA TCC CTT TTC AAG TTC CAC TTG GAA CTC CTG AAG GGA ACC ATC TCA 3102 
Lys Ser Leu Phe Lys Phe His Leu Glu Leu Leu Lys Gly Thr Ile Ser
975 980 985

AGA GCC GTA AAT GGT GGC GCA AGA AAG GTA AGA GTA GCG AAC AAT GCC 3150 
Arg Ala Val Asn Gly Gly Ala Arg Lys Val Arg Val Ala Lys Asn Ala 
990 995 1000 



ATG ACA AAA GGG GTT TTT CTC AAA ATC TAC AGC ATG CTT CCT CAC GTC 3198 
Met Thr Lys Gly Val Phe Leu Lys Ile Tyr Ser Met Leu Pro Asp Val
1005 1010 1015

TAC AAG TTT ATC ACA GTC TCG AGT GTC CTT TCC TTG TTG TTG ACA TTC 3246 

Tyr Lys Phe Ile Thr Val Ser Ser Val Leu Ser Leu Leu Leu Thr Phe
1020 1025 1030 

TTA TTT CAA ATT GAC TGC ATG ATA AGG GCA CAC CGA GAG GCG AAG GTT 3294 

Leu Phe Gln Ile Asp Cys Met Ile Arg Ala His Arg Glu Ala Lys Val
1035 1040 1045 1050

GCT GCA CAG TTG CAG AAA GAG AGC GAG TGG GAC AAT ATC ATC AAT AGA 3342 
Ala Ala Gln Leu Gln Lys Glu Ser Glu Trp Asp Asn Ile Ile Asn Arg

1055 1060 1065 

ACT TTC CAG TAT TCT AAG CTT GAA AAT CCT ATT GGC TAT CGC TCT ACA 3390 
Thr Phe Gln Tyr Ser Lys Leu Glu Asn Pro Ile Gly Tyr Arg Ser Thr
1070 1075 1080 

GCG GAG GAA AGA CTC CAA TCA GAA CAC CCC GAG GCT TTC GAG TAC TAC 3438
Ala Glu Glu Arg Leu Gln Ser Glu His Pro Glu Ala Phe Glu Tyr Tyr
1085 1090 1095 

AAG TTT TGC ATT GCA AAG GAA GAC CTC GTT GAA CAC GCA AAA CAA CCG 3486 
Cys Ile Gly Lys Glu Asp Leu Val Glu Gln Ala Lys Gln Pro

1100 1105 1110 
GAG ATA GCA TAC TTT GAA AAG ATT ATA GCT TTC ATC ACA CTT GTA TTA 3534
Glu Ile Ala Tyr Phe Glu Lys Ile Ile Ala Phe Ile Thr Leu Val Leu
1115 1120 1125 1130 

ATC GCT TTT GAC GCT GAG CGC AGT GAT GGA GTG TTC AAG ATA CTC AAT 3582
Met Ala Phe Asp Ala Glu Arg Ser Asp Gly Val Phe Lys Ile Leu Asn
1135 1140 1145 

AAG TTC AAA GGA ATA CTG AGC TCA ACG GAG AGG GAG ATC ATC TAC ACG 3630 
Lys Phe Lys Gly Ile Leu Ser Ser Thr Glu Arg Glu Ile Ile Tyr Thr
1150 1155 1160

CAG AGT TTG GAT GAT TAC GTT ACA ACC TTT GAT GAC AAT ATG ACA ATC 3678 
Gln Ser Leu Asp Asp Tyr Val Thr Thr Phe Asp Asp Asn Met Thr Ile
1165 1170 1175 

AAC CTC GAG TTC AAT ATG GAT GAA CTC CAC AAG ACG AGC CTT CCT GGA 3726
Asn Leu Glu Leu Asn Met Asp Glu Leu His Lys Thr Ser Leu Pro Gly
1180 1185 1190 

GTC ACT TTT AAG CAA TGG TGG AAC AAC CAA ATC AGC CGA GGC AAC CTG 3774
Val Thr Phe Lys Gln Trp Trp Asn Asn Gln Ile Ser Arg Gly Asn Val
1195 1200 1205 1210

AAG CCA CAT TAT AGA ACT GAG GGG CAC TTC ATG GAG TTT ACC AGA GAT 3822 
Lys Pro His Tyr Arg Thr Glu Gly His Phe Met Glu Phe Thr Arg Asp 
1215 1220 1225 

ACT CCG GCA TCG CTT GCC AGC GAG ATA TCA CAC TCA CCC GCA AGA CAT 3870
Thr Ala Ala Ser Val Ala Ser Glu Ile Ser His Ser Pro Ala Arg Asp
1230 1235 1240 

TTT CTT GTG AGA CGT GCT GTT GGA TCT GGA AAA TCC ACA GGA CTT CCA 3918
Phe Leu Val Arg Gly Ala Val Gly Ser Gly Lys Ser Thr Gly Leu Pro 
1245 1250 1255 

TAC CAT TTA TCA AAG AGA GGG ACA GTG TTA ATG CTT GAG CCT ACC AGA 3966 
Tyr His Leu Ser Lys Arg Gly Arg Val Leu Met Leu Glu Pro Thr Arg 
1260 1265 1270 

CCA CTC ACA GAT AAC ATG CAC AAG CAA CTG ACA AGT GAA CCA TTT AAC 4014 
Pro Leu Thr Asp Asn Met His Lys Gln Leu Arg Ser Glu Pro Phe Asn

1275 1280 1285 1290 

TGC TTC CCA ACT TTG AGG ATG AGA GGG AAG TCA ACT TTT GGG TCA TCA 4062 
Cys Phe Pro Thr Leu Arg Met Arg Gly Lys Ser Thr Phe Gly Ser Ser 
1295 1300 1305 



CCG ATC ACA CTC ATC ACT AGT GGA TTC CCT TTA CAC CAC TTT CCA CCA 4110 
Pro Ile Thr Val Met Thr Ser Gly Phe Ala Leu His His Phe Ala Arg
1310 1315 1320 

AAC ATA GCT GAG GTA AAA ACA TAC GAT TTT CTC ATA ATT GAT GAA TGT 4158 
Asn Ile Ala Glu Val Lys Thr Tyr Asp Phe Val Ile Ile Asp Glu Cys
1325 1330 1335 

CAT GTG AAT GAT GCT TCT GCT ATA GCG TTT AGG AAT GTA CTG TTT GAA 4206 
His Val Asn Asp Ala Ser Ala Ile Ala Phe Arg Asn Leu Leu Phe Glu
1340 1345 1350 

CAT GAA TTT CAA GGA AAA GTC CTC AAA GTG TCA GCC ACA CCA CCA GGT 4254 
His Glu Phe Clu Gly Lys Val Leu Lys Val Ser Ala Thr Pro Pro Gly 
1355 1360 1365 1370 

AGA GAA GTT GAA TTT ACA ACT CAG TTT CCC GTG AAA CTC AAG ATA GAA 4302 
Arg Glu Val Glu Phe Thr Thr Gln Phe Pro Val Lys Leu Lys Ile Glu
1375 1380 1385 
GAG CCT CTT AGC TTT CAG GAA TTT GTA ACT TTA CAA GGG ACA GGT GCC 4350
Glu Ala Leu Ser Phe Gln Glu Phe Val Ser Leu Gln Gly Thr Gly Ala
1390 1395 1400 

AAC GCC GAT GTG ATT AGT TGT CGC GAC AAC ATA CTA GTA TAT GTT GCT 4398
Asn Ala Asp Val Ile Ser Cys Gly Asp Asn Ile Leu Val Tyr Val Ala
1405 1410 1415 

AGC TAG AAT GAT GTT GAT AGT CTT GGC AAG CTC CTT GTG CAA AAG GGA 4446
Ser Tyr Asn Asp Val Asp Ser Leu Gly Lys Leu Leu Val Gln Lys Gly
1420 1425 1430

TAG AAA GTG TCG AAG ATT CAT GGA AGA ACA ATG AAG AGT GGA GGA ACT 4494 
Tyr Lys Val Ser Lys Ile Asp Gly Arg Thr Met Lys Ser Gly Gly Thr
1435 1440 1445 1450 

GAA ATA ATC ACT GAA GGT ACT TCA GTG AAA AAG CAT TTC ATA GTC GCA 4542
Glu Ile Ile Thr Glu Gly Thr Ser Val Lys Lys His Phe Ile Val Ala
1455 1460 1465 

ACT AAC ATT ATT GAG AAT GGT GTA ACC ATT GAC ATT GAT GTA GTT GTG 4590
Thr Asn Ile Ile Glu Asn Gly Val Thr Ile Asp Ile Asp Val Val Val
1470 1475 1480

GAT TTT GGG ACT AAG GTT GTA CCA GTT TTG GAT GTG GAC AAT AGA GCG 4638
Asp Phe Gly Thr Lys Val Val Pro Val Leu Asp Val Asp Asn Arg Ala
1485 1490 1495

GTG CAG TAC AAC AAA ACT GTG GTG AGT TAT GGG GAG CGC ATC CAA AAA 4686
Val Gln Tyr Asn Lys Thr Val Val Ser Tyr Gly Glu Arg Ile Gln Lys
1500 1505 1510 

CTC GCT AGA GTT GCC CCA CAC AAG GAA GGA GTA CCA CTT CCA ATT GGC 4734 
Leu Gly Arg Val Gly Arg His Lys Glu Gly Val Ala Leu Arg Ile Gly
1515 1520 1525 1530 

CAA ACA AAT AAA ACA CTG GTT GAA ATT CCA GAA ATG GTT GCC ACT GAA 4782 
Gln Thr Asn Lys Thr Leu Val Glu Ile Pro Glu Met Val Ala Thr Glu
1535 1540 1545 

GCT GCC TTT CTA TGC TTC ATG TAC AAT TTG CCA GTG ACA ACA CAG AGT 4830 
Ala Ala Phe Leu Cys Phe Met Tyr Asn Leu Pro Val Thr Thr Gln Ser
1550 1555 1560

GTT TCA ACC ACA CTG CTG GAA AAT GCC ACA TTA TTA CAA GCT AGA ACT 4878 
Val Ser Thr Thr Leu Leu Glu Aan Ala Thr Leu Leu Gin Ala Arg Thr 
1565 1570 1575 






ATG GCA CAG TTT GAG CTA TCA TAT TTT TAC ACA ATT AAT TTT GTG CGA 4926
Met Ala Gln Phe Glu Leu Ser Tyr Phe Tyr Thr Ile Asn Phe Val Arg
1580 1585 1590

TTT GAT GGT AGT ATG CAT CCA GTC ATA CAT GAC AAG CTG AAG CGC TTT 4974
Phe Asp Gly Ser Met His Pro Val Ile His Asp Lys Leu Lys Arg Phe
1595 1600 1605 1610

AAG CTA CAC ACT TGT GAG ACA TTC CTC AAT AAG TTG GCG ATC CCA AAT 5022
Lys Leu His Thr Cys Glu Thr Phe Leu Asn Lys Leu Ala Ile Pro Asn
1615 1620 1625

AAA GGC TTA TCC TCT TGG CTT ACG AGT GGA GAG TAT AAG CGA CTT GGT 5070
Lys Gly Leu Ser Ser Trp Leu Thr Ser Gly Glu Tyr Lys Arg Leu Gly
1630 1635 1640

TAC ATA GCA GAG GAT GCT GGC ATA AGA ATC CCA TTC GTG TGC AAA GAA 5118
Tyr Ile Ala Glu Asp Ala Gly Ile Arg Ile Pro Phe Val Cys Lys Glu
1645 1650 1655

ATT CCA GAC TCC TTG CAT GAG GAA ATT TGG CAC ATT GTA GTC GCC CAT 5166
Ile Pro Asp Ser Leu His Glu Glu Ile Trp His Ile Val Val Ala His
1660 1665 1670

AAA GGT GAC TCG GGT ATT GGG AGG CTC ACT AGC GTA CAG GCA GCA AAG 5214
Lys Gly Asp Ser Gly Ile Gly Arg Leu Thr Ser Val Gln Ala Ala Lys
1675 1680 1685 1690

GTT GTT TAT ACT CTC CAA ACG GAT GTG CAC TCA ATT GCG AGG ACT CTA 5262
Val Val Tyr Thr Leu Gln Thr Asp Val His Ser Ile Ala Arg Thr Leu
1695 1700 1705

GCA TGC ATC AAT AGA CGC ATA GCA GAT GAA CAA ATG AAG CAG AGT CAT 5310
Ala Cys Ile Asn Arg Arg Ile Ala Asp Glu Gln Met Lys Gln Ser His
1710 1715 1720

TTT GAA GCC GCA ACT GGC AGA GCA TTT TCC TTC ACA AAT TAC TCA ATA 5358
Phe Glu Ala Ala Thr Gly Arg Ala Phe Ser Phe Thr Asn Tyr Ser Ile
1725 1730 1735

CAA AGC ATA TTT GAC ACG CTG AAA GCA AAT TAT GCT ACA AAG CAT ACG 5406
Gln Ser Ile Phe Asp Thr Leu Lys Ala Asn Tyr Ala Thr Lys His Thr
1740 1745 1750

AAA GAA AAT ATT GCA GTG CTT CAG CAG GCA AAA GAT CAA TTG CTA GAG 5454
Lys Glu Asn Ile Ala Val Leu Gln Gln Ala Lys Asp Gln Leu Leu Glu
1755 1760 1765 1770

TTT TCG AAC CTA GCA AAG GAT CAA GAT GTC ACG GGT ATC ATC CAA GAC 5502
Phe Ser Asn Leu Ala Lys Asp Gln Asp Val Thr Gly Ile Ile Gln Asp
1775 1780 1785

TTC AAT CAC CTG GAA ACT ATC TAT CTC CAA TCA GAT AGC GAA GTG GCT 5550
Phe Asn His Leu Glu Thr Ile Tyr Leu Gln Ser Asp Ser Glu Val Ala
1790 1795 1800

AAG CAT CTG AAG CTT AAA AGT CAC TGG AAT AAA AGC CAA ATC ACT AGG 5598
Lys His Leu Lys Leu Lys Ser His Trp Asn Lys Ser Gln Ile Thr Arg
1805 1810 1815

GAC ATC ATA ATA GCT TTG TCT GTG TTA ATT GGT GGT GGA TGG ATG CTT 5646
Asp Ile Ile Ile Ala Leu Ser Val Leu Ile Gly Gly Gly Trp Met Leu
1820 1825 1830

GCA ACG TAC TTC AAG GAC AAG TTC AAT GAA CCA GTC TAT TTC CAA GGG 5694
Ala Thr Tyr Phe Lys Asp Lys Phe Asn Glu Pro Val Tyr Phe Gln Gly
1835 1840 1845 1850

AAG AAG AAT CAG AAG CAC AAG CTT AAG ATG AGA GAG GCG CGT GGG GCT 5742
Lys Lys Asn Gln Lys His Lys Leu Lys Met Arg Glu Ala Arg Gly Ala
1855 1860 1865

AGA GGG CAA TAT GAG GTT GCA GCG GAG CCA GAG GCG CTA GAA CAT TAC 5790
Arg Gly Gln Tyr Glu Val Ala Ala Glu Pro Glu Ala Leu Glu His Tyr
1870 1875 1880

TTT GGA AGC GCA TAT AAT AAC AAA GGA AAG CGC AAG GGC ACC ACG AGA 5838
Phe Gly Ser Ala Tyr Asn Asn Lys Gly Lys Arg Lys Gly Thr Thr Arg
1885 1890 1895

GGA ATG GGT GCA AAG TCT CGG AAA TTC ATA AAC ATG TAT GGG TTT GAT 5886
Gly Met Gly Ala Lys Ser Arg Lys Phe Ile Asn Met Tyr Gly Phe Asp
1900 1905 1910

CCA ACT GAT TTT TCA TAC ATT AGG TTT GTG GAT CCA TTG ACA GGT CAC 5934
Pro Thr Asp Phe Ser Tyr Ile Arg Phe Val Asp Pro Leu Thr Gly His
1915 1920 1925 1930



ACT ATT GAT GAG TCC ACA AAC GCA CCT ATT GAT TTA GTG CAG CAT GAG 5982
Thr Ile Asp Glu Ser Thr Asn Ala Pro Ile Asp Leu Val Gln His Glu
1935 1940 1945

TTT GGA AAG GTT AGA ACA CGC ATG TTA ATT GAC GAT GAG ATA GAG CCT 6030
Phe Gly Lys Val Arg Thr Arg Met Leu Ile Asp Asp Glu Ile Glu Pro
1950 1955 1960

CAA AGT CTT AGC ACC CAC ACC ACA ATC CAT GCT TAT TTG GTG AAT AGT 6078
Gln Ser Leu Ser Thr His Thr Thr Ile His Ala Tyr Leu Val Asn Ser
1965 1970 1975

GGC ACG AAG AAA GTT CTT AAG GTT GAT TTA ACA CCA CAC TCG TCG CTA 6126
Gly Thr Lys Lys Val Leu Lys Val Asp Leu Thr Pro His Ser Ser Leu
1980 1985 1990

CGT GCG AGT GAG AAA TCA ACA GCA ATA ATG GGA TTT CCT GAA AGG GAG 6174
Arg Ala Ser Glu Lys Ser Thr Ala Ile Met Gly Phe Pro Glu Arg Glu
1995 2000 2005 2010

AAT GAA TTG CGT CAA ACC GGC ATG GCA GTG CCA GTG GCT TAT GAT CAA 6222
Asn Glu Leu Arg Gln Thr Gly Met Ala Val Pro Val Ala Tyr Asp Gln
2015 2020 2025

TTG CCA CCA AAG AAT GAG GAC TTG ACG TTT GAA GGA GAA AGC TTG TTT 6270
Leu Pro Pro Lys Asn Glu Asp Leu Thr Phe Glu Gly Glu Ser Leu Phe
2030 2035 2040

AAG GGA CCA CGT GAT TAC AAC CCG ATA TCG AGC ACC ATT TGT CAT TTG 6318
Lys Gly Pro Arg Asp Tyr Asn Pro Ile Ser Ser Thr Ile Cys His Leu
2045 2050 2055

ACG AAT GAA TCT GAT GGG CAC ACA ACA TCG TTG TAT GGT ATT GGA TTT 6366
Thr Asn Glu Ser Asp Gly His Thr Thr Ser Leu Tyr Gly Ile Gly Phe
2060 2065 2070

GGT CCC TTC ATC ATT ACA AAC AAG CAC TTG TTT AGA AGA AAT AAT GGA 6414
Gly Pro Phe Ile Ile Thr Asn Lys His Leu Phe Arg Arg Asn Asn Gly
2075 2080 2085 2090

ACA CTG TTG GTC CAA TCA CTA CAT GGT GTA TTC AAG GTC AAG AAC ACC 6462
Thr Leu Leu Val Gln Ser Leu His Gly Val Phe Lys Val Lys Asn Thr
2095 2100 2105

ACG ACT TTG CAA CAA CAC CTC ATT GAT GGG AGG GAC ATG ATA ATT ATT 6510
Thr Thr Leu Gln Gln His Leu Ile Asp Gly Arg Asp Met Ile Ile Ile
2110 2115 2120

CGC ATG CCT AAG GAT TTC CCA CCA TTT CCT CAA AAG CTG AAA TTT AGA 6558
Arg Met Pro Lys Asp Phe Pro Pro Phe Pro Gln Lys Leu Lys Phe Arg
2125 2130 2135

GAG CCA CAA AGG GAA GAG CGC ATA TGT CTT GTG ACA ACC AAC TTC CAA 6606
Glu Pro Gln Arg Glu Glu Arg Ile Cys Leu Val Thr Thr Asn Phe Gln
2140 2145 2150

ACT AAG AGC ATG TCT AGC ATG GTG TCA GAC ACT AGT TGC ACA TTC CCT 6654
Thr Lys Ser Met Ser Ser Met Val Ser Asp Thr Ser Cys Thr Phe Pro
2155 2160 2165 2170

TCA TCT GAT GGC ATA TTC TGG AAG CAT TGG ATT CAA ACC AAG GAT GGG 6702
Ser Ser Asp Gly Ile Phe Trp Lys His Trp Ile Gln Thr Lys Asp Gly
2175 2180 2185

CAG TGT GGC AGT CCA TTA GTA TCA ACT AGA GAT GGG TTC ATT GTT GGT 6750
Gln Cys Gly Ser Pro Leu Val Ser Thr Arg Asp Gly Phe Ile Val Gly
2190 2195 2200
ATA CAC TCA GCA TCG AAT TTC ACC AAC ACA AAC AAT TAT TTC ACA AGC 6798
Ile His Ser Ala Ser Asn Phe Thr Asn Thr Asn Asn Tyr Phe Thr Ser
2205 2210 2215

GTG CCC AAA AAC TTC ATG GAA TTG TTG ACA AAT CAG GAG GCG CAG CAG 6846
Val Pro Lys Asn Phe Met Glu Leu Leu Thr Asn Gln Glu Ala Gln Gln
2220 2225 2230

TGG GTT AGT GGT TGG CGA TTA AAT GCT GAC TCA GTA TTG TGG GGG GGC 6894
Trp Val Ser Gly Trp Arg Leu Asn Ala Asp Ser Val Leu Trp Gly Gly
2235 2240 2245 2250

CAT AAA GTT TTC ATG AGC AAA CCT GAA GAG CCT TTT CAG CCA GTT AAG 6942
His Lys Val Phe Met Ser Lys Pro Glu Glu Pro Phe Gln Pro Val Lys
2255 2260 2265

GAA GCG ACT CAA CTC ATG AAT GAA TTG GTG TAC TCG CAA GGG GAG AAG 6990
Glu Ala Thr Gln Leu Met Asn Glu Leu Val Tyr Ser Gln Gly Glu Lys
2270 2275 2280

AGG AAA TGG GTC GTG GAA GCA CTG TCA GGG AAC TTG AGG CCA GTG GCT 7038
Arg Lys Trp Val Val Glu Ala Leu Ser Gly Asn Leu Arg Pro Val Ala
2285 2290 2295

GAG TGT CCC AGT CAG TTA GTC ACA AAG CAT GTG GTT AAA GGA AAG TGT 7086
Glu Cys Pro Ser Gln Leu Val Thr Lys His Val Val Lys Gly Lys Cys
2300 2305 2310

CCC CTC TTT GAG CTC TAC TTG CAG TTG AAT CCA GAA AAG GAA GCA TAT 7134
Pro Leu Phe Glu Leu Tyr Leu Gln Leu Asn Pro Glu Lys Glu Ala Tyr
2315 2320 2325 2330

TTT AAA CCG ATG ATG GGA GCA TAT AAG CCA AGT CGA CTT AAT AGA GAG 7182
Phe Lys Pro Met Met Gly Ala Tyr Lys Pro Ser Arg Leu Asn Arg Glu
2335 2340 2345

GCC TTC CTC AAG GAC ATT CTA AAA TAT GCT AGT GAA ATT GAG ATT GGG 7230
Ala Phe Leu Lys Asp Ile Leu Lys Tyr Ala Ser Glu Ile Glu Ile Gly
2350 2355 2360

AAT GTC GAT TGT GAC TTG CTG GAG CTT GCA ATA AGC ATG CTC GTC ACA 7278
Asn Val Asp Cys Asp Leu Leu Glu Leu Ala Ile Ser Met Leu Val Thr
2365 2370 2375

AAG CTC AAG GCG TTA GGA TTC CCA ACT GTG AAC TAC ATC ACT GAC CCA 7326
Lys Leu Lys Ala Leu Gly Phe Pro Thr Val Asn Tyr Ile Thr Asp Pro
2380 2385 2390

GAG GAA ATT TTT AGT GCA TTG AAT ATG AAA GCA GCT ATG GGA GCA CTA 7374
Glu Glu Ile Phe Ser Ala Leu Asn Met Lys Ala Ala Met Gly Ala Leu
2395 2400 2405 2410

TAC AAA GGC AAG AAG AAA GAA GCT CTC AGC GAG CTC ACA CTA GAT GAG 7422
Tyr Lys Gly Lys Lys Lys Glu Ala Leu Ser Glu Leu Thr Leu Asp Glu
2415 2420 2425

CAG GAG GCA ATG CTC AAA GCA AGT TGC CTG CGA CTG TAT ACG GGA AAG 7470
Gln Glu Ala Met Leu Lys Ala Ser Cys Leu Arg Leu Tyr Thr Gly Lys
2430 2435 2440






TTG GGA ATT TGG AAT GGC TCA TTG AAA GCA GAG TTG CGT CCA ATT GAG 7518
Leu Gly Ile Trp Asn Gly Ser Leu Lys Ala Glu Leu Arg Pro Ile Glu
2445 2450 2455

AAG GTT GAA AAC AAC AAA ACG CGA ACT TTC ACA GCA GCA CCA ATA GAC 7566
Lys Val Glu Asn Asn Lys Thr Arg Thr Phe Thr Ala Ala Pro Ile Asp
2460 2465 2470

ACT CTT CTT GCT GGT AAA GTT TGC GTG GAT GAT TTC AAC AAT CAA TTT 7614
Thr Leu Leu Ala Gly Lys Val Cys Val Asp Asp Phe Asn Asn Gln Phe
2475 2480 2485 2490

TAT GAT CTC AAC ATA AAG GCA CCA TGG ACA GTT GGT ATG ACT AAG TTT 7662
Tyr Asp Leu Asn Ile Lys Ala Pro Trp Thr Val Gly Met Thr Lys Phe
2495 2500 2505

TAT CAG GGG TGG AAT GAA TTG ATG GAG GCT TTA CCA AGT GGG TGG GTG 7710
Tyr Gln Gly Trp Asn Glu Leu Met Glu Ala Leu Pro Ser Gly Trp Val
2510 2515 2520

TAT TGT GAC GCT GAT GGT TCG CAA TTC GAC AGT TCC TTG ACT CCA TTC 7758
Tyr Cys Asp Ala Asp Gly Ser Gln Phe Asp Ser Ser Leu Thr Pro Phe
2525 2530 2535

CTC ATT AAT GCT GTA TTG AAA GTG CGA CTT GCC TTC ATG GAG GAA TGG 7806
Leu Ile Asn Ala Val Leu Lys Val Arg Leu Ala Phe Met Glu Glu Trp
2540 2545 2550

GAT ATT GGT GAG CAA ATG CTG CGA AAT TTG TAC ACT GAG ATA GTG TAT 7854
Asp Ile Gly Glu Gln Met Leu Arg Asn Leu Tyr Thr Glu Ile Val Tyr
2555 2560 2565 2570

ACA CCA ATC CTC ACA CCG GAT GGT ACT ATC ATT AAG AAG CAT AAA GGC 7902
Thr Pro Ile Leu Thr Pro Asp Gly Thr Ile Ile Lys Lys His Lys Gly
2575 2580 2585

AAC AAT AGC GGC CAA CCT TCA ACA GTG GTG GAC AAC ACA CTC ATG GTC 7950
Asn Asn Ser Gly Gln Pro Ser Thr Val Val Asp Asn Thr Leu Met Val
2590 2595 2600

ATT ATT GCA ATG TTA TAC ACA TGT GAG AAG TGT GGA ATC AAC AAG GAA 7998
Ile Ile Ala Met Leu Tyr Thr Cys Glu Lys Cys Gly Ile Asn Lys Glu
2605 2610 2615

GAG ATT GTG TAT TAC GTC AAT GGC GAT GAC CTA TTG ATT GCC ATT CAC 8046
Glu Ile Val Tyr Tyr Val Asn Gly Asp Asp Leu Leu Ile Ala Ile His
2620 2625 2630

CCA GAT AAA GCT GAG AGG TTG AGT AGA TTC AAA GAA TCT TTC GGA GAG 8094
Pro Asp Lys Ala Glu Arg Leu Ser Arg Phe Lys Glu Ser Phe Gly Glu
2635 2640 2645 2650

TTG GGC CTG AAA TAT GAA TTT GAC TGT ACC ACC AGG GAC AAG ACA CAG 8142
Leu Gly Leu Lys Tyr Glu Phe Asp Cys Thr Thr Arg Asp Lys Thr Gln
2655 2660 2665

TTG TGG TTC ATG TCA CAC AGG GCT TTG GAG AGG GAT GGC ATG TAT ATA 8190
Leu Trp Phe Met Ser His Arg Ala Leu Glu Arg Asp Gly Met Tyr Ile
2670 2675 2680

CCA AAG CTA GAA GAA GAA AGG ATT GTT TCT ATT TTG GAA TGG GAC AGA 8238
Pro Lys Leu Glu Glu Glu Arg Ile Val Ser Ile Leu Glu Trp Asp Arg
2685 2690 2695

TCC AAA GAG CCG TCA CAT AGG CTT GAA GCC ATC TGT GCA TCA ATG ATT 8286
Ser Lys Glu Pro Ser His Arg Leu Glu Ala Ile Cys Ala Ser Met Ile
2700 2705 2710

GAA GCA TGG GGT TAT GAC AAG CTC GTT GAA GAA ATC CGC AAT TTC TAT 8334
Glu Ala Trp Gly Tyr Asp Lys Leu Val Glu Glu Ile Arg Asn Phe Tyr
2715 2720 2725 2730

GCA TGG GTT TTG GAA CAA GCC CCG TAT TCA CAG CTT GCA GAA GAA GGA 8382
Ala Trp Val Leu Glu Gln Ala Pro Tyr Ser Gln Leu Ala Glu Glu Gly
2735 2740 2745



AAG GCG CCA TAT CTG GCT GAG ACT GCG CTT AAG TTT TTG TAC ACA TCT 8430
Lys Ala Pro Tyr Leu Ala Glu Thr Ala Leu Lys Phe Leu Tyr Thr Ser
2750 2755 2760

CAG CAC GGA ACA AAC TCT GAG ATA GAA GAG TAT TTA AAA GTG TTG TAT 8478
Gln His Gly Thr Asn Ser Glu Ile Glu Glu Tyr Leu Lys Val Leu Tyr
2765 2770 2775

GAT TAC GAT ATT CCA ACG ACT GAG AAT CTT TAT TTT CAG AGT GGC ACT 8526
Asp Tyr Asp Ile Pro Thr Thr Glu Asn Leu Tyr Phe Gln Ser Gly Thr
2780 2785 2790

GTG GAT GCT GGT GCT GAC GCT GGT AAG AAG AAA GAT CAA AAG GAT GAT 8574
Val Asp Ala Gly Ala Asp Ala Gly Lys Lys Lys Asp Gln Lys Asp Asp
2795 2800 2805 2810

AAA GTC GCT GAG CAG GCT TCA AAG GAT AGG GAT GTT AAT GCT GGA ACT 8622
Lys Val Ala Glu Gln Ala Ser Lys Asp Arg Asp Val Asn Ala Gly Thr
2815 2820 2825

TCA GGA ACA TTC TCA GTT CCA CGA ATA AAT GCT ATG GCC ACA AAA CTT 8670
Ser Gly Thr Phe Ser Val Pro Arg Ile Asn Ala Met Ala Thr Lys Leu
2830 2835 2840

CAA TAT CCA AGG ATG AGG GGA GAG GTG GTT GTA AAC TTG AAT CAC CTT 8718
Gln Tyr Pro Arg Met Arg Gly Glu Val Val Val Asn Leu Asn His Leu
2845 2850 2855

TTA GGA TAC AAG CCA CAG CAA ATT GAT TTG TCA AAT GCT CGA GCC ACA 8766
Leu Gly Tyr Lys Pro Gln Gln Ile Asp Leu Ser Asn Ala Arg Ala Thr
2860 2865 2870

CAT GAG CAG TTT GCC GCG TGG CAT CAG GCA GTG ATG ACA GCC TAT GGA 8814
His Glu Gln Phe Ala Ala Trp His Gln Ala Val Met Thr Ala Tyr Gly
2875 2880 2885 2890

GTG AAT GAA GAG CAA ATG AAA ATA TTG CTA AAT GGA TTT ATG GTG TGG 8862
Val Asn Glu Glu Gln Met Lys Ile Leu Leu Asn Gly Phe Met Val Trp
2895 2900 2905

TGC ATA GAA AAT GGG ACT TCC CCA AAT TTG AAC GGA ACT TGG GTT ATG 8910
Cys Ile Glu Asn Gly Thr Ser Pro Asn Leu Asn Gly Thr Trp Val Met
2910 2915 2920

ATG GAT GGT GAG GAT CAA GTT TCA TAC CCG CTG AAA CCA ATG GTT GAA 8958
Met Asp Gly Glu Asp Gln Val Ser Tyr Pro Leu Lys Pro Met Val Glu
2925 2930 2935

AAC GCG CAG CCA ACA CTG AGG CAA ATT ATG ACA CAC TTC AGT GAC CTG 9006
Asn Ala Gln Pro Thr Leu Arg Gln Ile Met Thr His Phe Ser Asp Leu
2940 2945 2950

GCT GAA GCG TAT ATT GAG ATG AGG AAT AGG GAG CGA CCA TAC ATG CCT 9054
Ala Glu Ala Tyr Ile Glu Met Arg Asn Arg Glu Arg Pro Tyr Met Pro
2955 2960 2965 2970

AGG TAT GGT CTA CAG AGA AAC ATT ACA GAC ATG AGT TTG TCA CGC TAT 9102
Arg Tyr Gly Leu Gln Arg Asn Ile Thr Asp Met Ser Leu Ser Arg Tyr
2975 2980 2985

GCG TTC GAC TTC TAT GAG CTA ACT TCA AAA ACA CCT GTT AGA GCG AGG 9150
Ala Phe Asp Phe Tyr Glu Leu Thr Ser Lys Thr Pro Val Arg Ala Arg
2990 2995 3000

GAG GCG CAT ATG CAA ATG AAA GCT GCT GCA GTA CGA AAC AGT GGA ACT 9198
Glu Ala His Met Gln Met Lys Ala Ala Ala Val Arg Asn Ser Gly Thr
3005 3010 3015

AGG TTA TTT GGT CTT GAT GGC AAC GTG GGT ACT GCA GAG GAA GAC ACT 9246
Arg Leu Phe Gly Leu Asp Gly Asn Val Gly Thr Ala Glu Glu Asp Thr
3020 3025 3030

GAA CGG CAC ACA GCG CAC GAT GTG AAC CGT AAC ATG CAC ACA CTA TTA 9294
Glu Arg His Thr Ala His Asp Val Asn Arg Asn Met His Thr Leu Leu
3035 3040 3045 3050

GGG GTC CGC CAG TGA TAGTTTCTGC GTGTCTTTGC TTTCCGCTTT TAACCTTATT 9349
Gly Val Arg Gln

GTAATATATA TGAATAGCTA TTCACAGTGG GACTTGGTCT TGTGTTGAAT AGTATCTTAT 9409 

ATATTTTAAT ATGTCTTATT AGTCTCATTA CTTAGGCCAA CGACAAAGTG AGGTCACCTC 9469 

GGTCTAATTC TCCTATGTAG TGCGAG 9495 






(3) INFORMATION FOR SEQ ID NO: 2: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 792 

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Double 

(D) TOPOLOGY: Circular 

(ii) MOLECULE TYPE: cDNA to genomic RNA 

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(v) FRAGMENT TYPE: N/A

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Tobacco Etch Virus 

(B) STRAIN: Highly Aphid Transmitted 

(C) INDIVIDUAL ISOLATE: N/A 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: No 

(B) CLONE: pTC:FL 

(viii) POSITION IN GENOME: N/A 
(ix) FEATURE:

(A) NAME/KEY: Mutations (AGT to ATG) introduced into nucleotides corresponding to genomic nucleotides
8518-8520 of SEQ ID No. 1, to create an initiating methionine codon.
(B) LOCATION: Nucleotides 1-3 of SEQ ID No. 2

(C) IDENTIFICATION METHOD: - 

(D) OTHER INFORMATION: SEQ ID NO: 2 is the modified Tobacco Etch Virus coat protein gene present 
in pTC:FL. 

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Allison et al. 

(B) TITLE: The nucleotide sequence of the coding region of Tobacco Etch Virus Genomic RNA: Evidence 
for the Synthesis of a Single Polyprotein 

(C) JOURNAL: Virology 

(D) VOLUME: 154 

(E) ISSUE: - 

(F) PAGES: 9-20 

(A) AUTHORS: Lindbo and Dougherty 

(B) TITLE: Untranslatable Transcripts of the tobacco etch virus coat protein gene sequence can interfere 
with tobacco etch virus replication in Transgenic Plants and Protoplasts 

(C) JOURNAL: Virology 

(D) VOLUME: 189 

(E) ISSUE: - 

(F) PAGES: 725-733 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
ATG GGC ACT 9
Met Gly Thr
1

GTG GAT GCT GGT GCT GAC GCT GGT AAG AAG AAA GAT CAA AAG GAT GAT 57
Val Asp Ala Gly Ala Asp Ala Gly Lys Lys Lys Asp Gln Lys Asp Asp
5 10 15

AAA GTC GCT GAG CAG GCT TCA AAG GAT AGG GAT GTT AAT GCT GGA ACT 105
Lys Val Ala Glu Gln Ala Ser Lys Asp Arg Asp Val Asn Ala Gly Thr
20 25 30 35

TCA GGA ACA TTC TCA GTT CCA CGA ATA AAT GCT ATG GCC ACA AAA CTT 153
Ser Gly Thr Phe Ser Val Pro Arg Ile Asn Ala Met Ala Thr Lys Leu
40 45 50

CAA TAT CCA AGG ATG AGG GGA GAG GTG GTT GTA AAC TTG AAT CAC CTT 201
Gln Tyr Pro Arg Met Arg Gly Glu Val Val Val Asn Leu Asn His Leu
55 60 65

TTA GGA TAC AAG CCA CAG CAA ATT GAT TTG TCA AAT GCT CGA GCC ACA 249
Leu Gly Tyr Lys Pro Gln Gln Ile Asp Leu Ser Asn Ala Arg Ala Thr
70 75 80

CAT GAG CAG TTT GCC GCG TGG CAT CAG GCA GTG ATG ACA GCC TAT GGA 297
His Glu Gln Phe Ala Ala Trp His Gln Ala Val Met Thr Ala Tyr Gly
85 90 95

GTG AAT GAA GAG CAA ATG AAA ATA TTG CTA AAT GGA TTT ATG GTG TGG 345
Val Asn Glu Glu Gln Met Lys Ile Leu Leu Asn Gly Phe Met Val Trp
100 105 110 115

TGC ATA GAA AAT GGG ACT TCC CCA AAT TTG AAC GGA ACT TGG GTT ATG 393
Cys Ile Glu Asn Gly Thr Ser Pro Asn Leu Asn Gly Thr Trp Val Met
120 125 130



ATG GAT GGT GAG GAT CAA GTT TCA TAC CCG CTG AAA CCA ATG GTT GAA 441
Met Asp Gly Glu Asp Gln Val Ser Tyr Pro Leu Lys Pro Met Val Glu
135 140 145

AAC GCG CAG CCA ACA CTG AGG CAA ATT ATG ACA CAC TTC AGT GAC CTG 489
Asn Ala Gln Pro Thr Leu Arg Gln Ile Met Thr His Phe Ser Asp Leu
150 155 160

GCT GAA GCG TAT ATT GAG ATG AGG AAT AGG GAG CGA CCA TAC ATG CCT 537
Ala Glu Ala Tyr Ile Glu Met Arg Asn Arg Glu Arg Pro Tyr Met Pro
165 170 175

AGG TAT GGT CTA CAG AGA AAC ATT ACA GAC ATG AGT TTG TCA CGC TAT 585
Arg Tyr Gly Leu Gln Arg Asn Ile Thr Asp Met Ser Leu Ser Arg Tyr
180 185 190 195

GCG TTC GAC TTC TAT GAG CTA ACT TCA AAA ACA CCT GTT AGA GCG AGG 633
Ala Phe Asp Phe Tyr Glu Leu Thr Ser Lys Thr Pro Val Arg Ala Arg
200 205 210

GAG GCG CAT ATG CAA ATG AAA GCT GCT GCA GTA CGA AAC AGT GGA ACT 681
Glu Ala His Met Gln Met Lys Ala Ala Ala Val Arg Asn Ser Gly Thr
215 220 225

AGG TTA TTT GGT CTT GAT GGC AAC GTG GGT ACT GCA GAG GAA GAC ACT 729
Arg Leu Phe Gly Leu Asp Gly Asn Val Gly Thr Ala Glu Glu Asp Thr
230 235 240

GAA CGG CAC ACA GCG CAC GAT GTG AAC CGT AAC ATG CAC ACA CTA TTA 777
Glu Arg His Thr Ala His Asp Val Asn Arg Asn Met His Thr Leu Leu
245 250 255

GGG GTC CGC CAG TGA 792
Gly Val Arg Gln
260
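
The feature entry for SEQ ID NO: 2 states that genomic nucleotides 8518-8520 (AGT) were changed to ATG to create an initiating methionine codon. The following is a minimal Python sketch of that substitution, assuming the standard genetic code. The names `CODON_TABLE`, `translate`, `genomic_start` and `modified_start` are illustrative and are not part of the patent; the 18-nucleotide fragment is the stretch beginning at genomic position 8518 as listed in SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a DNA sequence in frame 1, stopping at the first stop codon."""
    seq = seq.upper().replace(" ", "")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Genomic nucleotides 8518-8535 of SEQ ID NO: 1 (Ser Gly Thr Val Asp Ala).
genomic_start = "AGTGGCACTGTGGATGCT"
# Replace the first codon (AGT) with ATG, as described for SEQ ID NO: 2.
modified_start = "ATG" + genomic_start[3:]

print(translate(genomic_start))   # SGTVDA
print(translate(modified_start))  # MGTVDA
```

Only the first residue changes (Ser to Met); the remainder of the reading frame is untouched, which is consistent with the rows of SEQ ID NO: 2 above.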

(4) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 793

(B) TYPE: Nucleic Acid

(C) STRANDEDNESS: Double
(D) TOPOLOGY: Circular

(ii) MOLECULE TYPE: cDNA to genomic RNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(v) FRAGMENT TYPE: N/A

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Tobacco Etch Virus

(B) STRAIN: Highly Aphid Transmitted
(C) INDIVIDUAL ISOLATE: N/A

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: No
(B) CLONE: pTC:RC

(viii) POSITION IN GENOME: N/A
(ix) FEATURE:
(A) NAME/KEY: Mutation of AGT-GGC (Ser-Gly) to ATG-GCC (Met-Ser)

(B) LOCATION: Nucleotides 1-6 of SEQ ID NO. 3 (corresponding to nucleotides 8518-8523 of SEQ ID
NO. 1)

(A) NAME/KEY: Frameshift mutation (insertion of T) producing stop codon

(B) LOCATION: Nucleotide 13 of SEQ ID No. 3 (corresponding to position between nucleotides 8529 and
8530 of SEQ ID No. 1)

(D) OTHER INFORMATION: SEQ ID No: 3 is the modified Tobacco Etch Virus coat protein gene present
in pTC:RC.

(x) PUBLICATION INFORMATION:

(A) AUTHORS: J. A. Lindbo and W. G. Dougherty

(B) TITLE: Pathogen-Derived Resistance to a Potyvirus: Immune and Resistant Phenotypes in Transgenic
Tobacco Expressing Altered Forms of a Potyvirus Coat Protein Nucleotide Sequence

(C) JOURNAL: Molecular Plant-Microbe Interactions

(D) VOLUME: 5

(E) ISSUE: 2

(F) PAGES: 144-153

(A) AUTHORS: J. A. Lindbo and W. G. Dougherty

(B) TITLE: Untranslatable Transcripts of the Tobacco Etch Virus Coat Protein Gene Sequence Can
Interfere with Tobacco Etch Virus Replication in Transgenic Plants and Protoplasts

(C) JOURNAL: Virology

(D) VOLUME: 189

(E) ISSUE: -

(F) PAGES: 725-733

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATG GCC ACT 9
Met Ser Thr

GTG TGA TGA TGGTGCTAGC GCTGGTAAGA AGAAAGATCA AAAGGATGAT 58
Val

AAAGTCGCTG AGCAGGCTTC AAAGGATAGG GATGTTAATG CTGGAACTTC 108

AGGAACATTC TCAGTTCCAC GAATAAATGC TATGGCCACA AAACTTCAAT 158

ATCCAAGGAT GAGGGGAGAG GTGGTTGTAA ACTTGAATCA CCTTTTAGGA 208

TACAAGCCAC AGCAAATTGA TTTGTCAAAT GCTCGAGCCA CACATGAGCA 258

GTTTGCCGCG TGGCATCAGG CAGTGATGAC AGCCTATGGA GTGAATGAAG 308

AGCAAATGAA AATATTGCTA AATGGATTTA TGGTGTGGTG CATAGAAAAT 358

GGGACTTCCC CAAATTTGAA CGGAACTTGG GTTATGATGG ATGGTGAGGA 408

TCAAGTTTCA TACCCGCTGA AACCAATGGT TGAAAACGCG CAGCCAACAC 458

TGAGGCAAAT TATGACACAC TTCAGTGACC TGGCTGAAGC GTATATTGAG 508

ATGAGGAATA GGGAGCGACC ATACATGCCT AGGTATGGTC TACAGAGAAA 558

CATTACAGAC ATGAGTTTGT CACGCTATGC GTTCGACTTC TATGAGCTAA 608

CTTCAAAAAC ACCTGTTAGA GCGAGGGAGG CGCATATGCA AATGAAAGCT 658

GCTGCAGTAC GAAACAGTGG AACTAGGTTA TTTGGTCTTG ATGGCAACGT 708

GGGTACTGCA GAGGAAGACA CTGAACGGCA CACAGCGCAC GATGTGAACC 758

GTAACATGCA CACACTATTA GGGGTCCGCC AGTGA 793
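
The second feature of SEQ ID NO: 3 is a frameshift: a T inserted at nucleotide 13 brings a stop codon into frame immediately after the fourth codon. The sketch below illustrates the principle only, assuming the standard genetic code. It applies the single-base insertion to the first 18 nucleotides of SEQ ID NO: 2 rather than reproducing every substitution present in SEQ ID NO: 3, and the helper names (`insert_base`, `translate_until_stop`) are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def insert_base(seq: str, position: int, base: str) -> str:
    """Insert `base` so that it becomes nucleotide `position` (1-based)."""
    return seq[:position - 1] + base + seq[position - 1:]


def translate_until_stop(seq: str) -> tuple[str, bool]:
    """Translate frame 1; return the peptide and whether a stop codon was hit."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            return "".join(peptide), True
        peptide.append(amino_acid)
    return "".join(peptide), False


# First 18 nucleotides of SEQ ID NO: 2 (Met Gly Thr Val Asp Ala).
start = "ATGGGCACTGTGGATGCT"
# Insert T at nucleotide 13, the position given for the frameshift feature.
shifted = insert_base(start, 13, "T")

print(translate_until_stop(start))    # ('MGTVDA', False)
print(translate_until_stop(shifted))  # ('MGTV', True)
```

After the insertion the fifth codon reads TGA, so translation terminates after four residues, which is the sense in which the transcript is untranslatable.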



(5) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 792

(B) TYPE: Nucleic acid

(C) STRANDEDNESS: Double

(D) TOPOLOGY: Circular

(ii) MOLECULE TYPE: cDNA to genomic RNA
(iii) HYPOTHETICAL: No
(iv) ANTI-SENSE: Yes

(v) FRAGMENT TYPE: N/A

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Tobacco Etch Virus
(B) STRAIN: Highly Aphid Transmitted

(C) INDIVIDUAL ISOLATE: N/A

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: No

(B) CLONE: pTC:AS

(viii) POSITION IN GENOME: N/A
(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: -

(C) IDENTIFICATION METHOD: -






EP 0 626 998 B1 

(D) OTHER INFORMATION: SEQ ID No. 4 is the modified Tobacco Etch Virus coat protein gene present
in pTC:AS. It is the inverse complement of SEQ ID No. 2.

(x) PUBLICATION INFORMATION:

(A) AUTHORS: J. A. Lindbo and W. G. Dougherty

(B) TITLE: Untranslatable Transcripts of the Tobacco Etch Virus Coat Protein Gene Sequence Can
Interfere with Tobacco Etch Virus Replication in Transgenic Plants and Protoplasts

(C) JOURNAL: Virology

(D) VOLUME: 189
(E) ISSUE: -

(F) PAGES: 725-733

(A) AUTHORS: J. A. Lindbo and W. G. Dougherty

(B) TITLE: Pathogen-Derived Resistance to a Potyvirus: Immune and Resistant Phenotypes in Transgenic
Tobacco Expressing Altered Forms of a Potyvirus Coat Protein Nucleotide Sequence

(C) JOURNAL: Molecular Plant-Microbe Interactions

(D) VOLUME: 5

(E) ISSUE: 2

(F) PAGES: 144-153

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:



TCACTGGCGG ACCCCTAATA GTGTGTGCAT GTTACGGTTC ACATCGTGCG CTGTGTGCCG 60

TTCAGTGTCT TCCTCTGCAG TACCCACGTT GCCATCAAGA CCAAATAACC TAGTTCCACT 120

GTTTCGTACT GCAGCAGCTT TCATTTGCAT ATGCGCCTCC CTCGCTCTAA CAGGTGTTTT 180

TGAAGTTAGC TCATAGAAGT CGAACGCATA GCGTGACAAA CTCATGTCTG TAATGTTTCT 240

CTGTAGACCA TACCTAGGCA TGTATGGTCG CTCCCTATTC CTCATCTCAA TATACGCTTC 300

AGCCAGGTCA CTGAAGTGTG TCATAATTTG CCTCAGTGTT GGCTGCGCGT TTTCAACCAT 360

TGGTTTCAGC GGGTATGAAA CTTGATCCTC ACCATCCATC ATAACCCAAG TTCCGTTCAA 420

ATTTGGGGAA GTCCCATTTT CTATGCACCA CACCATAAAT CCATTTAGCA ATATTTTCAT 480

TTGCTCTTCA TTCACTCCAT AGGCTGTCAT CACTGCCTGA TGCCACGCGG CAAACTGCTC 540

ATGTGTGGCT CGAGCATTTG ACAAATCAAT TTGCTGTGGC TTGTATCCTA AAAGGTGATT 600

CAAGTTTACA ACCACCTCTC CCCTCATCCT TGGATATTGA AGTTTTGTGG CCATAGCATT 660

TATTCGTGGA ACTGAGAATG TTCCTGAAGT TCCAGCATTA ACATCCCTAT CCTTTGAAGC 720

CTGCTCAGCG ACTTTATCAT CCTTTTGATC TTTCTTCTTA CCAGCGTCAG CACCAGCATC 780

CACAGTGCCC AT 792
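
SEQ ID NO: 4 is described as the inverse complement of SEQ ID NO: 2. A short Python sketch of that operation follows; it is offered as an illustration rather than as the procedure used to prepare the listing. The function name `reverse_complement` and the two test fragments (the first 9 and last 15 nucleotides of SEQ ID NO: 2) are chosen here for the example.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the inverse (reverse) complement of a DNA sequence."""
    return seq.upper().replace(" ", "").translate(COMPLEMENT)[::-1]


sense_start = "ATGGGCACT"       # nucleotides 1-9 of SEQ ID NO: 2
sense_end = "GGGGTCCGCCAGTGA"   # nucleotides 778-792 of SEQ ID NO: 2

# The end of the sense strand becomes the start of the antisense strand.
print(reverse_complement(sense_end))    # TCACTGGCGGACCCC
# The start of the sense strand becomes the end of the antisense strand.
print(reverse_complement(sense_start))  # AGTGCCCAT
```

Applying the function twice returns the original sequence, which gives a quick consistency check when comparing a transcribed copy of SEQ ID NO: 4 against SEQ ID NO: 2.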







Claims

1. A method of producing a plant with a reduced susceptibility to viral infection, comprising:

transforming plant cells with a DNA molecule that encodes an untranslatable plus-sense viral RNA molecule,
wherein the untranslatable plus-sense viral RNA molecule is derived from the nucleotide sequence of a plant
virus gene; and

regenerating a plant comprising the transformed plant cell.

2. A transgenic plant produced according to the method of Claim 1.

3. The method of Claim 1 wherein the untranslatable plus-sense viral RNA molecule is derived from a viral coat
protein gene.

4. The method of Claim 1 wherein the untranslatable plus-sense viral RNA molecule is derived from a potyvirus.

5. A DNA molecule useful for producing virus resistant plants, comprising a promoter operably linked to a DNA
molecule encoding an untranslatable plus-sense viral RNA molecule, derived from the nucleotide sequence of a plant
virus gene.

6. The method of any one of Claims 1, 3 or 4 wherein the untranslatable plus-sense viral RNA molecule contains at
least one mutation that renders the RNA molecule untranslatable, and expression of the untranslatable plus-sense
viral RNA molecule within the plant reduces the susceptibility of the plant to virus infection;

and wherein the method further comprises the step of selecting a plant that shows a reduced susceptibility
to infection by the virus.



Patentansprüche

1. Ein Verfahren zur Herstellung einer Pflanze mit einer reduzierten Anfälligkeit für eine virale Infektion, umfassend:

Transformation von Pflanzenzellen mit einem DNA-Molekül, das für ein nicht-translatierbares plus-strang
virales RNA-Molekül kodiert, dadurch gekennzeichnet, daß das nicht-translatierbare plus-strang virale
RNA-Molekül von der Nukleotidsequenz eines Pflanzenvirusgens abgeleitet ist; und

Regeneration einer Pflanze, beinhaltend die transformierte Pflanzenzelle.

2. Eine transgene Pflanze, hergestellt entsprechend des Verfahrens nach Anspruch 1.

3. Das Verfahren nach Anspruch 1, dadurch gekennzeichnet, daß das nicht-translatierbare plus-strang virale
RNA-Molekül von einem viralen Hüllproteingen abgeleitet ist.

4. Das Verfahren nach Anspruch 1, dadurch gekennzeichnet, daß das nicht-translatierbare plus-strang virale
RNA-Molekül von einem Potyvirus abgeleitet ist.

5. Ein DNA-Molekül, verwendbar zur Herstellung virusresistenter Pflanzen, umfassend einen Promoter, wirksam
verbunden mit einem DNA-Molekül, das für ein nicht-translatierbares plus-strang virales RNA-Molekül kodiert,
welches von der Nukleotidsequenz eines Pflanzenvirusgens abgeleitet ist.

6. Das Verfahren nach einem der Ansprüche 1, 3 oder 4, dadurch gekennzeichnet, daß das nicht-translatierbare
plus-strang virale RNA-Molekül mindestens eine Mutation enthält, die das RNA-Molekül nicht-translatierbar macht,
und daß die Expression des nicht-translatierbaren plus-strang viralen RNA-Moleküls in der Pflanze zu einer
reduzierten Anfälligkeit der Pflanze gegen Virusinfektion führt;

und dadurch gekennzeichnet, daß das Verfahren weiterhin den Schritt der Selektion einer Pflanze umfaßt, die
eine reduzierte Anfälligkeit für eine Infektion durch das Virus zeigt.
Revendications

1. Méthode de production d'une plante avec une sensibilité réduite à l'infection virale, comprenant les étapes
consistant :

à transformer des cellules végétales avec une molécule d'ADN qui code une molécule d'ARN viral sens plus
non traduisible, la molécule d'ARN viral sens plus non traduisible étant dérivée de la séquence nucléotidique
d'un gène de virus de plante ; et

à régénérer une plante comprenant la cellule végétale transformée.

2. Plante transgénique produite selon la méthode de la revendication 1.

3. Méthode selon la revendication 1, dans laquelle la molécule d'ARN viral sens plus non traduisible est dérivée d'un
gène de protéine de capside virale.

4. Méthode selon la revendication 1, dans laquelle la molécule d'ARN viral sens plus non traduisible est dérivée d'un
potyvirus.

5. Molécule d'ADN utile pour produire des plantes résistantes à des virus, comprenant un promoteur lié de façon
opérationnelle à une molécule d'ADN codant une molécule d'ARN viral sens plus non traduisible, dérivée de la
séquence nucléotidique d'un gène de virus de plante.

6. Méthode selon l'une quelconque des revendications 1, 3 ou 4, dans laquelle la molécule d'ARN viral sens plus
non traduisible contient au moins une mutation, qui rend la molécule d'ARN non traduisible, et l'expression de la
molécule d'ARN viral sens plus non traduisible dans la plante réduit la sensibilité de la plante à une infection virale ;

et la méthode comprenant en outre l'étape consistant à sélectionner une plante qui montre une sensibilité réduite
à une infection par le virus.
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NAAATAACAA ATCTCAACAC AACATATACA AAACAAACCA ATCTCAACCA ATCAAGCATT 60 

CTACTTCTAT TGCAGCAATT TAAATCATTT CTTTTAAACC AAAAGCAATT TTCTGAAAAT 120 

TTTCACCATT TACGAACGAT AGCA ATG GCA CTG ATC TTT GGC ACA GTC AAC GCT 174 

Met Ala Leu lie Phe Gly Thr Val Asn Ala 

15 10 

AAC ATC CTG AAG GAA GTG TTC GGT GGA GCT CGT ATG GCT TGC GTT ACC 222 

Asn He Leu Lys Glu Val Phe Gly Gly Ala Arg Met Ala Cys Val Thr 

IS 20 25 

AGC GCA CAT ATC GCT GCA GCG AAT GGA AGC ATT TTG AAG AAG GCA GAA 270 

Ser Ala His Met Ala Gly Ala Asn Gly Ser He Leu Lys Lys Ala Glu 

30 35 40 

GAG ACC TCT CGT GCA ATC ATG CAC AAA CCA GTG ATC TTC GCA GAA GAC 318 

Glu Thr Ser Arg Ala lie Met His Lys Pro Val He Phe Gly Glu Asp 

45 SO 55 

TAC ATT ACC GAG GCA GAC TTG CCT TAC ACA CCA CTC CAT TTA GAG GTC 366 

Tyr He Thr Glu Ala Asp Leu Pro Tyr Thr Pro Leu His Leu Glu Val 

60 65 70 

CAT CCT GAA ATG GAG CGG ATG TAT TAT CTT GGT CGT CGC GCG CTC ACC 414 

Asp Ala Glu Met Glu Arg Met Tyr Tyr Leu Gly Arg Arg Ala Leu Thr 

75 80 65 90 

CAT GGC AAG AGA CGC AAA GTT TCT CTC AAT AAC AAG AGG AAC AGG ACA 462 

His Gly Lys Arg Arg Lys Val Ser Val Asn Asn Lys Arg Asn Arg Arg 

95 100 105 

AGG AAA GTG GCC AAA ACG TAC GTG GGG CGT GAT TCC ATT GTT GAG AAG 510 

Arg Lys Val Ala Lys Thr Tyr Val Gly Arg Asp Ser He Val Glu Lys 

110 115 120 

ATT GTA GTG CCC CAC ACC GAC AGA AAG GTT GAT ACC ACA CCA CCA GTG 558 

He V&l Val Pro His Thr Glu Arg Lys Val Asp Thr Thr Ala Ala Val 

125 130 135 

GAA GAC ATT TGC AAT GAA GCT ACC ACT CAA CTT GTG CAT AAT ACT ATG 606 

Glu Asp Ile Cys Asn Glu Ala Thr Thr Gln Leu Val His Asn Ser Met

140 145 150 

CCA AAG CGT AAG AAG CAG AAA AAC TTC TTG CCC GCC ACT TCA CTA ACT 654 

Pro Lys Arg Lys Lys Gln Lys Asn Phe Leu Pro Ala Thr Ser Leu Ser

155 160 165 170 

AAC GTG TAT GCC CAA ACT TGG AGC ATA GTG CGC AAA CGC CAT ATG CAG 702 

Asn Val Tyr Ala Gln Thr Trp Ser Ile Val Arg Lys Arg His Met Gln

175 180 185 

GTG GAG ATC ATT AGC AAG AAG AGC GTC CCA GCG AGG GTC AAG AGA TTT 750 

Val Glu Ile Ile Ser Lys Lys Ser Val Arg Ala Arg Val Lys Arg Phe

190 195 200 

GAG GGC TCG GTG CAA TTG TTC GCA ACT GTG CGT CAC ATG TAT GGC GAG 798

Glu Gly Ser Val Gln Leu Phe Ala Ser Val Arg His Met Tyr Gly Glu

205 210 215 

AGG AAA AGG GTG GAC TTA CGT ATT GAC AAC TGG CAG CAA GAG ACA CTT 846 

Arg Lys Arg Val Asp Leu Arg Ile Asp Asn Trp Gln Gln Glu Thr Leu

220 225 230 
CTA GAC CTT GCT AAA AGA TTT AAG AAT GAG AGA GTG GAT CAA TCG AAG 894

Leu Asp Leu Ala Lys Arg Phe Lys Asn Glu Arg Val Asp Gln Ser Lys

235 240 245 250 

CTC ACT TTT GGT TCA AGT GGC CTA GTT TTG AGG CAA GGC TCG TAG GGA 942 

Leu Thr Phe Gly Ser Ser Gly Leu Val Leu Arg Gln Gly Ser Tyr Gly

255 260 265 

CCT GCG CAT TGG TAT CGA CAT GGT ATG TTC ATT GTA CGC GGT CGG TCG 990 

Pro Ala His Trp Tyr Arg His Gly Met Phe Ile Val Arg Gly Arg Ser

270 275 280

GAT GGG ATG TTG GTG GAT GCT CGT GCG AAG GTA ACG TTC GCT GTT TGT 1038 

Asp Gly Met Leu Val Asp Ala Arg Ala Lys Val Thr Phe Ala Val Cys

285 290 295 

CAC TCA ATG ACA CAT TAT AGC GAC AAA TCA ATC TCT GAG GCA TTC TTC 1086 

His Ser Met Thr His Tyr Ser Asp Lys Ser Ile Ser Glu Ala Phe Phe

300 305 310 

ATA CCA TAC TCT AAG AAA TTC TTG GAG TTG AGA CCA GAT GGA ATC TCC 1134 

Ile Pro Tyr Ser Lys Lys Phe Leu Glu Leu Arg Pro Asp Gly Ile Ser

315 320 325 330 

CAT GAG TGT ACA AGA GGA GTA TCA GTT GAG CGG TGG GGT GAG GTG GCT 1182 

His Glu Cys Thr Arg Gly Val Ser Val Glu Arg Cys Gly Glu Val Ala

335 340 345 

CCA ATC CTC ACA CAA CCA CTT TCA CCG TGT GGT AAG ATC ACA TGC AAA 1230 

Ala Ile Leu Thr Gln Ala Leu Ser Pro Cys Gly Lys Ile Thr Cys Lys

350 355 360 

CCT TGC ATG GTT GAA ACA CCT GAC ATT GTT GAG GGT GAG TCG GGA GAA 1278 

Arg Cys Met Val Glu Thr Pro Asp Ile Val Glu Gly Glu Ser Gly Glu

365 370 375 

AGT GTC ACC AAC CAA CGT AAG CTC CTA GCA ATG CTG AAA GAA CAG TAT 1326 

Ser Val Thr Asn Gln Gly Lys Leu Leu Ala Met Leu Lys Glu Gln Tyr

380 385 390 

CCA GAT TTC CCA ATC GCC GAG AAA CTA CTC ACA AGG TTT TTG CAA CAG 1374 

Pro Asp Phe Pro Met Ala Glu Lys Leu Leu Thr Arg Phe Leu Gln Gln

395 400 405 410 

AAA TCA CTA GTA AAT ACA AAT TTG ACA GCC TGC GTG AGC GTC AAA CAA 1422 

Lys Ser Leu Val Asn Thr Asn Leu Thr Ala Cys Val Ser Val Lys Gln

415 420 425 

CTC ATT GGT GAC CGC AAA CAA GCT CCA TTC ACA CAC GTA CTG GCT GTC 1470 

Leu Ile Gly Asp Arg Lys Gln Ala Pro Phe Thr His Val Leu Ala Val

430 435 440 

AGC GAA ATT CTC TTT AAA GGC AAT AAA CTA ACA GGC GCT GAT CTC GAA 1518 

Ser Glu Ile Leu Phe Lys Gly Asn Lys Leu Thr Gly Ala Asp Leu Glu

445 450 455 

GAG GCA AGC ACA CAT ATG CTT GAA ATA GCA AGG TTC TTG AAC AAT CGC 1566 

Glu Ala Ser Thr His Met Leu Glu Ile Ala Arg Phe Leu Asn Asn Arg

460 465 470 

ACT GAA AAT ATG CGC ATT GGC CAC CTT GGT TCT TTC AGA AAT AAA ATC 1614 

Thr Glu Asn Met Arg Ile Gly His Leu Gly Ser Phe Arg Asn Lys Ile

475 480 485 490
TCA TCG AAC CCC CAT GTG AAT AAC GCA CTC ATG TGT GAT AAT CAA CTT 1662

Ser Ser Lys Ala His Val Asn Asn Ala Leu Met Cys Asp Asn Gln Leu

495 500 505 

GAT CAG AAT GCG AAT TTT ATT TGG GGA CTA AGG GGT GCA CAC GCA AAG 1710 

Asp Gln Asn Gly Asn Phe Ile Trp Gly Leu Arg Gly Ala His Ala Lys

510 515 520 

AGG TTT CTT AAA GGA TTT TTC ACT GAG ATT GAC CCA AAT GAA GCA TAC 1758 

Arg Phe Leu Lys Gly Phe Phe Thr Glu Ile Asp Pro Asn Glu Gly Tyr

525 530 535 

GAT AAG TAT GTT ATC AGG AAA CAT ATC AGG GGT AGC AGA AAG CTA GCA 1806 

Asp Lys Tyr Val Ile Arg Lys His Ile Arg Gly Ser Arg Lys Leu Ala

540 545 550

ATT GGC AAT TTG ATA ATG TCA ACT GAC TTC CAG ACG CTC AGG CAA CAA 1854 

Ile Gly Asn Leu Ile Met Ser Thr Asp Phe Gln Thr Leu Arg Gln Gln
555 560 565 570 

ATT CAA GGC GAA ACT ATT GAG CGT AAA GAA ATT GGG AAT CAC TGC ATT 1902 

Ile Gln Gly Glu Thr Ile Glu Arg Lys Glu Ile Gly Asn His Cys Ile

575 580 585

TCA ATG CCG AAT GGT AAT TAC GTG TAC CCA TGT TGT TGT GTT ACT CTT 1950 

Ser Met Arg Asn Gly Asn Tyr Val Tyr Pro Cys Cys Cys Val Thr Leu

590 595 600 

CAA GAT GGT AAG GCT CAA TAT TCG GAT CTA AAG CAC CCA ACG AAG AGA 1998

Glu Asp Gly Lys Ala Gln Tyr Ser Asp Leu Lys His Pro Thr Lys Arg
605 610 615 

CAT CTG GTC ATT GGC AAC TCT GGC GAT TCA AAG TAC CTA GAC CTT CCA 2046 

His Leu Val Ile Gly Asn Ser Gly Asp Ser Lys Tyr Leu Asp Leu Pro

620 625 630 

GTT CTC AAT GAA GAG AAA ATG TAT ATA GCT AAT GAA GGT TAT TGC TAC 2094

Val Leu Asn Glu Glu Lys Met Tyr Ile Ala Asn Glu Gly Tyr Cys Tyr
635 640 645 650 

ATG AAC ATT TTC TTT GCT CTA CTA GTG AAT GTC AAG GAA GAG GAT GCA 2142 

Met Asn Ile Phe Phe Ala Leu Leu Val Asn Val Lys Glu Glu Asp Ala

655 660 665 

AAG GAC TTC ACC AAG TTT ATA AGG GAC ACA ATT GTT CCA AAG CTT GGA 2190 

Lys Asp Phe Thr Lys Phe Ile Arg Asp Thr Ile Val Pro Lys Leu Gly

670 675 680 

GCG TCG CCA ACA ATG CAA GAT GTT GCA ACT GCA TGC TAC TTA CTT TCC 2238 

Ala Trp Pro Thr Met Gln Asp Val Ala Thr Ala Cys Tyr Leu Leu Ser
685 690 695 

ATT CTT TAC CCA GAT GTC CTG AGA GCT GAA CTA CCC AGA ATT TTG GTT 2286 

Ile Leu Tyr Pro Asp Val Leu Arg Ala Glu Leu Pro Arg Ile Leu Val
700 705 710 

GAT CAT GAC AAC AAA ACA ATG CAT GTT TTG GAT TCC TAT GGG TCT AGA 2334 

Asp His Asp Asn Lys Thr Met His Val Leu Asp Ser Tyr Gly Ser Arg 
715 720 725 730 

ACG ACA GGA TAC CAC ATG TTG AAA ATG AAC ACA ACA TCC CAG CTA ATT 2382 

Thr Thr Gly Tyr His Met Leu Lys Met Asn Thr Thr Ser Gln Leu Ile

735 740 745 
GAA TTC GTT CAT TCA GGT TTG GAA TCC GAA ATG AAA ACT TAG AAT GTT 2430

Glu Phe Val His Ser Gly Leu Glu Ser Glu Met Lys Thr Tyr Asn Val

750 755 760 

GGA GGG ATG AAC CGA GAT GTG GTC ACA CAA GGT GCA ATT GAG ATG TTG 2478 

Gly Gly Met Asn Arg Asp Val Val Thr Gln Gly Ala Ile Glu Met Leu

765 770 775 

ATC AAG TCT ATA TAG AAA CCA CAT CTC ATG AAG CAG TTA CTT GAG GAA 2526

Ile Lys Ser Ile Tyr Lys Pro His Leu Met Lys Gln Leu Leu Glu Glu

780 785 790 

GAG CCA TAC ATA ATT GTC CTG GCA ATA GTC TCC CCT TCA ATT TTA ATT 2574 

Glu Pro Tyr Ile Ile Val Leu Ala Ile Val Ser Pro Ser Ile Leu Ile
795 800 805 810 

GCC ATG TAC AAC TCT GCA ACT TTT GAG CAG GCG TTA CAA ATG TCG TTG 2622 

Ala Met Tyr Asn Ser Gly Thr Phe Glu Gln Ala Leu Gln Met Trp Leu
815 820 825 

CCA AAT ACA ATG ACG TTA GCT AAC CTC GCT GCC ATC TTG TCA GCC TTA 2670 

Pro Asn Thr Met Arg Leu Ala Asn Leu Ala Ala Ile Leu Ser Ala Leu

830 835 840 

GCG CAA AAG TTA ACT TTG GCA GAT TTG TTC GTC CAG CAG CGT AAT TTG 2718 

Ala Gln Lys Leu Thr Leu Ala Asp Leu Phe Val Gln Gln Arg Asn Leu

845 850 855 

ATT AAT GAG TAT GCG CAG GTA ATT TTG GAC AAT CTG ATT GAC GGT GTC 2766 

Ile Asn Glu Tyr Ala Gln Val Ile Leu Asp Asn Leu Ile Asp Gly Val

860 865 870 

ACG GTT AAT CAT TCG CTA TCC CTA GCA ATG GAA ATT GTT ACT ATT AAG 2814 

Arg Val Asn His Ser Leu Ser Leu Ala Met Glu Ile Val Thr Ile Lys
875 880 885 890 

CTG GCC ACC CAA GAG ATG GAC ATG GCG TTG AGG GAA GGT GGC TAT GCT 2862 

Leu Ala Thr Gln Glu Met Asp Met Ala Leu Arg Glu Gly Gly Tyr Ala
895 900 905 

GTG ACC TCT GAA AAG GTG CAT GAA ATG TTG GAA AAA AAC TAT GTA AAC 2910 

Val Thr Ser Glu Lys Val His Glu Met Leu Glu Lys Asn Tyr Val Lys 

910 915 920 

GCT TTG AAC GAT GCA TCC GAC CAA TTA ACT TGC TTG GAA AAA TTC TCC 2958 

Ala Leu Lys Asp Ala Trp Asp Glu Leu Thr Trp Leu Glu Lys Phe Ser 

925 930 935 

GCA ATC AGG CAT TCA AGA AAG CTC TTG AAA TTT GGG CGA AAG CCT TTA 3006 

Ala Ile Arg His Ser Arg Lys Leu Leu Lys Phe Gly Arg Lys Pro Leu

940 945 950 

ATC ATG AAA AAC ACC GTA GAT TGC GGC GGA CAT ATA GAC TTG TCT GTG 3054 

Ile Met Lys Asn Thr Val Asp Cys Gly Gly His Ile Asp Leu Ser Val
955 960 965 970 

AAA TCG CTT TTC AAG TTC CAC TTG GAA CTC CTG AAG GGA ACC ATC TCA 3102 

Lys Ser Leu Phe Lys Phe His Leu Glu Leu Leu Lys Gly Thr Ile Ser
975 980 985 

AGA GCC GTA AAT GGT GGC GCA AGA AAG GTA AGA GTA GCG AAG AAT GCC 3150 

Arg Ala Val Asn Gly Gly Ala Arg Lys Val Arg Val Ala Lys Asn Ala 

990 995 1000 
ATG ACA AAA GGG GTT TTT CTC AAA ATC TAG AGC ATG CTT CCT GAC GTC 3198
Met Thr Lys Gly Val Phe Leu Lys Ile Tyr Ser Met Leu Pro Asp Val
1005 1010 1015 

TAG AAG TTT ATC ACA GTC TCG ACT GTC CTT TCC TTG TTG TTG ACA TTC 3246 
Tyr Lys Phe Ile Thr Val Ser Ser Val Leu Ser Leu Leu Leu Thr Phe
1020 1025 1030 

TTA TTT CAA ATT GAC TGC ATG ATA AGG GCA CAC CGA GAG GCG AAG GTT 3294 
Leu Phe Gln Ile Asp Cys Met Ile Arg Ala His Arg Glu Ala Lys Val
1035 1040 1045 1050 

GCT GCA CAG TTG CAG AAA GAG AGC GAG TGG GAC AAT ATC ATC AAT AGA 3342 
Ala Ala Gln Leu Gln Lys Glu Ser Glu Trp Asp Asn Ile Ile Asn Arg
1055 1060 1065 

ACT TTC CAG TAT TCT AAG CTT GAA AAT CCT ATT GGG TAT CGC TCT ACA 3390 

Thr Phe Gln Tyr Ser Lys Leu Glu Asn Pro Ile Gly Tyr Arg Ser Thr
1070 1075 1080 

GCG GAG GAA AGA CTC CAA TCA GAA CAC CCC GAG GCT TTC GAG TAC TAC 3438 
Ala Glu Glu Arg Leu Gln Ser Glu His Pro Glu Ala Phe Glu Tyr Tyr
1085 1090 1095 

AAG TTT TGC ATT GGA AAG GAA GAC CTC GTT GAA CAG GCA AAA CAA CCG 3486 
Lys Phe Cys Ile Gly Lys Glu Asp Leu Val Glu Gln Ala Lys Gln Pro
1100 1105 1110 

GAG ATA GCA TAC TTT GAA AAG ATT ATA GCT TTC ATC ACA CTT CTA TTA 3534 
Glu Ile Ala Tyr Phe Glu Lys Ile Ile Ala Phe Ile Thr Leu Val Leu
1115 1120 1125 1130 

ATG GCT TTT GAC GCT GAG CGG AGT GAT GGA GTG TTC AAG ATA CTC AAT 3582 
Met Ala Phe Asp Ala Glu Arg Ser Asp Gly Val Phe Lys Ile Leu Asn
1135 1140 1145 

AAG TTC AAA GGA ATA CTG AGC TCA ACG GAG AGG GAG ATC ATC TAC ACG 3630 

Lys Phe Lys Gly Ile Leu Ser Ser Thr Glu Arg Glu Ile Ile Tyr Thr
1150 1155 1160 

CAG AGT TTG GAT GAT TAC GTT ACA ACC TTT GAT GAC AAT ATG ACA ATC 3678 
Gln Ser Leu Asp Asp Tyr Val Thr Thr Phe Asp Asp Asn Met Thr Ile
1165 1170 1175 

AAC CTC GAG TTG AAT ATG GAT GAA CTC CAC AAG ACG AGC CTT CCT GGA 3726 
Asn Leu Glu Leu Asn Met Asp Glu Leu His Lys Thr Ser Leu Pro Gly 
1180 1185 1190 

GTC ACT TTT AAG CAA TGG TGG AAC AAC CAA ATC AGC CGA GGC AAC GTG 3774 
Val Thr Phe Lys Gln Trp Trp Asn Asn Gln Ile Ser Arg Gly Asn Val
1195 1200 1205 1210 

AAG CCA CAT TAT AGA ACT GAC GGG CAC TTC ATG GAG TTT ACC AGA GAT 3822 
Lys Pro His Tyr Arg Thr Glu Gly His Phe Met Glu Phe Thr Arg Asp
1215 1220 1225 

ACT GCG GCA TCG GTT GCC AGC GAG ATA TCA CAC TCA CCC GCA AGA GAT 3870 
Thr Ala Ala Ser Val Ala Ser Glu Ile Ser His Ser Pro Ala Arg Asp
1230 1235 1240 

TTT CTT GTG AGA GGT CCT GTT GGA TCT GGA AAA TCC ACA GGA CTT CCA 3918 
Phe Leu Val Arg Gly Ala Val Gly Ser Gly Lys Ser Thr Gly Leu Pro 
1245 1250 1255 
TAC CAT TTA TCA AAC AGA GGG AGA GTG TTA ATG CTT GAG CCT ACC AGA 3966
Tyr His Leu Ser Lys Arg Gly Arg Val Leu Met Leu Glu Pro Thr Arg
1260 1265 1270 

CCA CTC ACA GAT AAC ATG CAC AAG CAA CTG AGA AGT GAA CCA TTT AAC 4014 
Pro Leu Thr Asp Asn Met His Lys Gln Leu Arg Ser Glu Pro Phe Asn
1275 1280 1285 1290 

TGC TTC CCA ACT TTG AGC ATG AGA GGG AAG TCA ACT TTT GGG TCA TCA 4062
Cys Phe Pro Thr Leu Arg Met Arg Gly Lys Ser Thr Phe Gly Ser Ser

1295 1300 1305 

CCG ATC ACA GTC ATG ACT AGT GGA TTC GCT TTA CAC CAC TTT GCA CGA 4110 
Pro Ile Thr Val Met Thr Ser Gly Phe Ala Leu His His Phe Ala Arg
1310 1315 1320 

AAC ATA GCT GAG CTA AAA ACA TAC GAT TTT GTC ATA ATT GAT GAA TCT 4158 
Asn Ile Ala Glu Val Lys Thr Tyr Asp Phe Val Ile Ile Asp Glu Cys
1325 1330 1335 

CAT GTG AAT GAT GCT TCT GCT ATA GCG TTT AGG AAT CTA CTG TTT CAA 4206 
His Val Asn Asp Ala Ser Ala Ile Ala Phe Arg Asn Leu Leu Phe Glu
1340 1345 1350 

CAT GAA TTT GAA GGA AAA GTC CTC AAA GTG TCA GCC ACA CCA CCA GGT 4254 
His Glu Phe Glu Gly Lys Val Leu Lys Val Ser Ala Thr Pro Pro Gly 
1355 1360 1365 1370 

AGA GAA GTT GAA TTT ACA ACT CAG TTT CCC GTG AAA CTC AAG ATA GAA 4302 
Arg Glu Val Glu Phe Thr Thr Gln Phe Pro Val Lys Leu Lys Ile Glu
1375 1380 1385 

GAG CCT CTT AGC TTT CAG GAA TTT GTA AGT TTA CAA GGG ACA GGT GCC 4350 
Glu Ala Leu Ser Phe Gln Glu Phe Val Ser Leu Gln Gly Thr Gly Ala
1390 1395 1400 

AAC GCC GAT GTG ATT AGT TCT GCC GAC AAC ATA CTA GTA TAT GTT GCT 4398 
Asn Ala Asp Val Ile Ser Cys Gly Asp Asn Ile Leu Val Tyr Val Ala
1405 1410 1415 

AGC TAC AAT GAT GTT GAT AGT CTT GGC AAG CTC CTT CTG CAA AAG GGA 4446
Ser Tyr Asn Asp Val Asp Ser Leu Gly Lys Leu Leu Val Gln Lys Gly
1420 1425 1430 

TAC AAA GTG TCG AAG ATT GAT GGA AGA ACA ATG AAG AGT GGA GGA ACT 4494 
Tyr Lys Val Ser Lys Ile Asp Gly Arg Thr Met Lys Ser Gly Gly Thr
1435 1440 1445 1450 

GAA ATA ATC ACT GAA GGT ACT TCA GTG AAA AAC CAT TTC ATA GTC GCA 4542 
Glu Ile Ile Thr Glu Gly Thr Ser Val Lys Lys His Phe Ile Val Ala
1455 1460 1465 

ACT AAC ATT ATT GAG AAT GGT GTA ACC ATT GAC ATT GAT GTA GTT GTG 4590 
Thr Asn Ile Ile Glu Asn Gly Val Thr Ile Asp Ile Asp Val Val Val
1470 1475 1480

GAT TTT GGG ACT AAG GTT GTA CCA GTT TTG GAT GTG GAC AAT AGA GCG 4638 
Asp Phe Gly Thr Lys Val Val Pro Val Leu Asp Val Asp Asn Arg Ala
1485 1490 1495

GTG CAG TAC AAC AAA ACT GTG GTG AGT TAT GGG GAG CGC ATC CAA AAA 4686 
Val Gln Tyr Asn Lys Thr Val Val Ser Tyr Gly Glu Arg Ile Gln Lys
1500 1505 1510 
CTC GGT AGA GTT GGG CGA CAC AAG GAA GGA GTA GCA CTT CGA ATT GGC 4734
Leu Gly Arg Val Gly Arg His Lys Glu Gly Val Ala Leu Arg Ile Gly
1515 1520 1525 1530 

CAA ACA AAT AAA ACA CTG GTT GAA ATT CCA GAA ATG GTT GCC ACT GAA 4782 
Gln Thr Asn Lys Thr Leu Val Glu Ile Pro Glu Met Val Ala Thr Glu
1535 1540 1545 

GCT GCC TTT CTA TGC TTC ATG TAG AAT TTG CCA GTG ACA ACA CAC ACT 4830
Ala Ala Phe Leu Cys Phe Met Tyr Asn Leu Pro Val Thr Thr Gln Ser

1550 1555 1560 

CTT TCA ACC ACA CTG CTG GAA AAT GCC ACA TTA TTA CAA GCT AGA ACT 4878 
Val Ser Thr Thr Leu Leu Glu Asn Ala Thr Leu Leu Gln Ala Arg Thr
1565 1570 1575 

ATG GCA CAG TTT GAG CTA TCA TAT TTT TAC ACA ATT AAT TTT GTG CGA 4926 
Met Ala Gln Phe Glu Leu Ser Tyr Phe Tyr Thr Ile Asn Phe Val Arg
1580 1585 1590 

TTT GAT GGT AGT ATG CAT CCA GTC ATA CAT GAC AAG CTG AAG CGC TTT 4974
Phe Asp Gly Ser Met His Pro Val Ile His Asp Lys Leu Lys Arg Phe
1595 1600 1605 1610 

AAG CTA CAC ACT TGT GAG ACA TTC CTC AAT AAG TTG GCG ATC CCA AAT 5022 
Lys Leu His Thr Cys Glu Thr Phe Leu Asn Lys Leu Ala Ile Pro Asn
1615 1620 1625 

AAA GGC TTA TCC TCT TGG CTT ACG AGT GGA GAG TAT AAG CGA CTT GGT 5070 
Lys Gly Leu Ser Ser Trp Leu Thr Ser Gly Glu Tyr Lys Arg Leu Gly 
1630 1635 1640 

TAC ATA GCA GAG GAT GCT GGC ATA AGA ATC CCA TTC GTG TGC AAA GAA 5118 
Tyr Ile Ala Glu Asp Ala Gly Ile Arg Ile Pro Phe Val Cys Lys Glu
1645 1650 1655 

ATT CCA CAC TCC TTG CAT GAG GAA ATT TCC CAC ATT GTA GTC GCC CAT 5166 
Ile Pro Asp Ser Leu His Glu Glu Ile Trp His Ile Val Val Ala His
1660 1665 1670 

AAA GGT GAC TCG GGT ATT GGG ACG CTC ACT AGC GTA CAG GCA GCA AAG 5214
Lys Gly Asp Ser Gly Ile Gly Arg Leu Thr Ser Val Gln Ala Ala Lys
1675 1680 1685 1690 

CTT GTT TAT ACT CTG CAA ACG GAT GTG CAC TCA ATT GCG AGG ACT CTA 5262 
Val Val Tyr Thr Leu Gln Thr Asp Val His Ser Ile Ala Arg Thr Leu
1695 1700 1705 

GCA TGC ATC AAT AGA CGC ATA GCA GAT GAA CAA ATG AAG CAG AGT CAT 5310 

Ala Cys Ile Asn Arg Arg Ile Ala Asp Glu Gln Met Lys Gln Ser His
1710 1715 1720 

TTT GAA GCC GCA ACT GGG AGA GCA TTT TCC TTC ACA AAT TAC TCA ATA 5358 
Phe Glu Ala Ala Thr Gly Arg Ala Phe Ser Phe Thr Asn Tyr Ser Ile
1725 1730 1735 

CAA AGC ATA TTT GAC ACG CTG AAA GCA AAT TAT GCT ACA AAG CAT ACG 5406 
Gln Ser Ile Phe Asp Thr Leu Lys Ala Asn Tyr Ala Thr Lys His Thr
1740 1745 1750 

AAA GAA AAT ATT GCA GTG CTT CAG CAG GCA AAA GAT CAA TTG CTA GAG 5454 
Lys Glu Asn Ile Ala Val Leu Gln Gln Ala Lys Asp Gln Leu Leu Glu
1755 1760 1765 1770 
TTT TCG AAC CTA GCA AAG GAT CAA GAT GTC ACG GGT ATC ATC CAA GAG 5502
Phe Ser Asn Leu Ala Lys Asp Gln Asp Val Thr Gly Ile Ile Gln Asp
1775 1780 1785

TTC AAT CAC CTG GAA ACT ATC TAT CTC CAA TCA GAT AGC GAA GTG GCT 5550
Phe Asn His Leu Glu Thr Ile Tyr Leu Gln Ser Asp Ser Glu Val Ala
1790 1795 1800

AAG CAT CTG AAG CTT AAA ACT CAC TGG AAT AAA AGC CAA ATC ACT AGG 5598
Lys His Leu Lys Leu Lys Ser His Trp Asn Lys Ser Gln Ile Thr Arg
1805 1810 1815

GAC ATC ATA ATA GCT TTG TCT GTG TTA ATT GGT GGT GGA TGG ATG CTT 5646
Asp Ile Ile Ile Ala Leu Ser Val Leu Ile Gly Gly Gly Trp Met Leu
1820 1825 1830

GCA ACG TAC TTC AAC GAC AAG TTC AAT GAA CCA GTC TAT TTC CAA GGG 5694
Ala Thr Tyr Phe Lys Asp Lys Phe Asn Glu Pro Val Tyr Phe Gln Gly
1835 1840 1845 1850

AAG AAG AAT GAG AAG CAC AAG CTT AAG ATG AGA GAG GCG CGT GGG GCT 5742
Lys Lys Asn Gln Lys His Lys Leu Lys Met Arg Glu Ala Arg Gly Ala
1855 1860 1865

AGA GGG CAA TAT GAG GTT GCA GCG GAG CCA GAG GCG CTA GAA CAT TAC 5790
Arg Gly Gln Tyr Glu Val Ala Ala Glu Pro Glu Ala Leu Glu His Tyr
1870 1875 1880

TTT GGA AGC GCA TAT AAT AAC AAA GGA AAG CGC AAG GGC ACC ACG AGA 5838
Phe Gly Ser Ala Tyr Asn Asn Lys Gly Lys Arg Lys Gly Thr Thr Arg
1885 1890 1895

GGA ATG GGT GCA AAG TCT CGG AAA TTC ATA AAC ATG TAT GGG TTT GAT 5886
Gly Met Gly Ala Lys Ser Arg Lys Phe Ile Asn Met Tyr Gly Phe Asp
1900 1905 1910

CCA ACT CAT TTT TCA TAC ATT AGG TTT GTG GAT CCA TTG ACA GCT CAC 5934
Pro Thr Asp Phe Ser Tyr Ile Arg Phe Val Asp Pro Leu Thr Gly His
1915 1920 1925 1930

ACT ATT GAT GAG TCC ACA AAC GCA CCT ATT GAT TTA GTG CAG CAT GAG 5982
Thr Ile Asp Glu Ser Thr Asn Ala Pro Ile Asp Leu Val Gln His Glu
1935 1940 1945

TTT GGA AAG GTT ACA ACA CGC ATG TTA ATT GAC GAT GAC ATA GAC CCT 6030
Phe Gly Lys Val Arg Thr Arg Met Leu Ile Asp Asp Glu Ile Glu Pro
1950 1955 1960

CAA AGT CTT AGC ACC CAC ACC ACA ATC CAT GCT TAT TTG GTG AAT ACT 6078
Gln Ser Leu Ser Thr His Thr Thr Ile His Ala Tyr Leu Val Asn Ser
1965 1970 1975

GGC ACG AAG AAA GTT CTT AAG GTT GAT TTA ACA CCA CAC TCG TCG CTA 6126
Gly Thr Lys Lys Val Leu Lys Val Asp Leu Thr Pro His Ser Ser Leu
1980 1985 1990

CGT GCG AGT GAG AAA TCA ACA GCA ATA ATG GGA TTT CCT GAA AGG GAG 6174
Arg Ala Ser Glu Lys Ser Thr Ala Ile Met Gly Phe Pro Glu Arg Glu
1995 2000 2005 2010

AAT GAA TTG CGT CAA ACC GGC ATG GCA GTG CCA GTC GCT TAT GAT CAA 6222
Asn Glu Leu Arg Gln Thr Gly Met Ala Val Pro Val Ala Tyr Asp Gln
2015 2020 2025

TTG CCA CCA AAG AAT GAG GAC TTG ACG TTT GAA GGA GAA AGC TTG TTT 6270

Leu Pro Pro Lys Asn Glu Asp Leu Thr Phe Glu Gly Glu Ser Leu Phe
2030 2035 2040 

AAG GGA CCA CGT GAT TAG AAC CCG ATA TCG AGC ACC ATT TGT CAT TTG 6318 

Lys Gly Pro Arg Asp Tyr Asn Pro Ile Ser Ser Thr Ile Cys His Leu

2045 2050 2055 

ACG AAT GAA TCT GAT GGG CAC ACA ACA TCG TTG TAT GGT ATT GGA TTT 6366 

Thr Asn Glu Ser Asp Gly His Thr Thr Ser Leu Tyr Gly Ile Gly Phe

2060 2065 2070 

GGT CCC TTC ATC ATT ACA AAC AAG CAC TTG TTT AGA AGA AAT AAT GGA 6414

Gly Pro Phe Ile Ile Thr Asn Lys His Leu Phe Arg Arg Asn Asn Gly
2075 2080 2085 2090

ACA CTG TTG GTC CAA TCA CTA CAT GGT GTA TTC AAG GTC AAG AAC ACC 6462 

Thr Leu Leu Val Gln Ser Leu His Gly Val Phe Lys Val Lys Asn Thr
2095 2100 2105 

ACG ACT TTG CAA CAA CAC CTC ATT GAT GGG AGG GAC ATG ATA ATT ATT 6510 

Thr Thr Leu Gln Gln His Leu Ile Asp Gly Arg Asp Met Ile Ile Ile
2110 2115 2120 

CGC ATG CCT AAG GAT TTC CCA CCA TTT CCT CAA AAG CTG AAA TTT AGA 6558 

Arg Met Pro Lys Asp Phe Pro Pro Phe Pro Gln Lys Leu Lys Phe Arg

2125 2130 2135 

CAC CCA CAA ACG CAA GAG CGC ATA TGT CTT GTG ACA ACC AAC TTC CAA 6606 

Glu Pro Gln Arg Glu Glu Arg Ile Cys Leu Val Thr Thr Asn Phe Gln

2140 2145 2150 

ACT AAG AGC ATC TCT AGC ATG GTG TCA GAC ACT ACT TGC ACA TTC CCT 6654 

Thr Lys Ser Met Ser Ser Met Val Ser Asp Thr Ser Cys Thr Phe Pro
2155 2160 2165 2170 

TCA TCT GAT GCC ATA TTC TCG AAG CAT TCG ATT CAA ACC AAG GAT GGG 6702 

Ser Ser Asp Gly Ile Phe Trp Lys His Trp Ile Gln Thr Lys Asp Gly
2175 2180 2185 

GAG TGT GGC AGT CCA TTA GTA TCA ACT AGA GAT GGG TTC ATT GTT GGT 6750 

Gln Cys Gly Ser Pro Leu Val Ser Thr Arg Asp Gly Phe Ile Val Gly
2190 2195 2200 

ATA CAC TCA GCA TCG AAT TTC ACC AAC ACA AAC AAT TAT TTC ACA AGC 6798 

Ile His Ser Ala Ser Asn Phe Thr Asn Thr Asn Asn Tyr Phe Thr Ser

2205 2210 2215 

GTG CCG AAA AAC TTC ATG GAA TTG TTG ACA AAT CAG GAG GCG CAG GAG 6846 

Val Pro Lys Asn Phe Met Glu Leu Leu Thr Asn Gln Glu Ala Gln Gln

2220 2225 2230 

TCG GTT AGT GGT TGG CGA TTA AAT GCT GAC TCA GTA TTG TGG GGG GGC 6894 

Trp Val Ser Gly Trp Arg Leu Asn Ala Asp Ser Val Leu Trp Gly Gly
2235 2240 2245 2250 

CAT AAA GTT TTC ATG AGC AAA CCT GAA GAG CCT TTT CAG CCA GTT AAG 6942 

His Lys Val Phe Met Ser Lys Pro Glu Glu Pro Phe Gln Pro Val Lys
2255 2260 2265 

GAA GCG ACT CAA CTC ATG AAT GAA TTG GTG TAC TCG CAA GGG GAG AAG 6990 

Glu Ala Thr Gln Leu Met Asn Glu Leu Val Tyr Ser Gln Gly Glu Lys
2270 2275 2280 
AGG AAA TGG GTC GTG GAA GCA CTG TCA GGG AAC TTG AGG CCA GTG GCT 7038
Arg Lys Trp Val Val Glu Ala Leu Ser Gly Asn Leu Arg Pro Val Ala
2285 2290 2295 

GAG TGT CCC ACT GAG TTA GTC ACA AAG CAT GTG GTT AAA GGA AAG TGT 7086 
Glu Cys Pro Ser Gln Leu Val Thr Lys His Val Val Lys Gly Lys Cys
2300 2305 2310 

CCC CTC TTT GAG CTC TAG TTG CAG TTG AAT CCA GAA AAG GAA GCA TAT 7134 
Pro Leu Phe Glu Leu Tyr Leu Gln Leu Asn Pro Glu Lys Glu Ala Tyr
2315 2320 2325 2330 

TTT AAA CCG ATG ATG GGA GCA TAT AAG CCA ACT CGA CTT AAT AGA GAG 7182 
Phe Lys Pro Met Met Gly Ala Tyr Lys Pro Ser Arg Leu Asn Arg Glu 
2335 2340 2345 

CCG TTC CTC AAG GAC ATT CTA AAA TAT GCT AGT GAA ATT GAG ATT GGG 7230 
Ala Phe Leu Lys Asp Ile Leu Lys Tyr Ala Ser Glu Ile Glu Ile Gly
2350 2355 2360 

AAT GTG GAT TGT GAC TTG CTG GAG CTT GCA ATA AGC ATG CTC GTC ACA 7278

Asn Val Asp Cys Asp Leu Leu Glu Leu Ala Ile Ser Met Leu Val Thr
2365 2370 2375 

AAG CTC AAG GGG TTA GGA TTC CCA ACT GTG AAC TAC ATC ACT GAC CCA 7326 
Lys Leu Lys Ala Leu Gly Phe Pro Thr Val Asn Tyr Ile Thr Asp Pro
2380 2385 2390 

GAG GAA ATT TTT AGT GCA TTG AAT ATG AAA GCA GCT ATG GGA GCA CTA 7374 
Glu Glu Ile Phe Ser Ala Leu Asn Met Lys Ala Ala Met Gly Ala Leu
2395 2400 2405 2410 

TAC AAA GGC AAG AAG AAA GAA GCT CTC AGC GAG CTC ACA CTA GAT GAG 7422 
Tyr Lys Gly Lys Lys Lys Glu Ala Leu Ser Glu Leu Thr Leu Asp Glu
2415 2420 2425 

CAG GAG GCA ATG CTC AAA GCA AGT TGC CTG CGA CTG TAT ACG GGA AAG 7470 
Gln Glu Ala Met Leu Lys Ala Ser Cys Leu Arg Leu Tyr Thr Gly Lys
2430 2435 2440 

TTG GGA ATT TGC AAT GGC TCA TTG AAA GCA GAG TTG CGT CCA ATT GAC 7518 
Leu Gly Ile Trp Asn Gly Ser Leu Lys Ala Glu Leu Arg Pro Ile Glu
2445 2450 2455

AAG GTT GAA AAC AAC AAA ACG CGA ACT TTC ACA GCA GCA CCA ATA GAC 7566 
Lys Val Glu Asn Asn Lys Thr Arg Thr Phe Thr Ala Ala Pro Ile Asp
2460 2465 2470 

ACT CTT CTT GCT GGT AAA GTT TGC GTG GAT GAT TTC AAC AAT CAA TTT 7614 
Thr Leu Leu Ala Gly Lys Val Cys Val Asp Asp Phe Asn Asn Gln Phe
2475 2480 2485 2490 

TAT GAT CTC AAC ATA AAG GCA CCA TGG ACA GTT GGT ATG ACT AAG TTT 7662 
Tyr Asp Leu Asn Ile Lys Ala Pro Trp Thr Val Gly Met Thr Lys Phe
2495 2500 2505 

TAT CAG GGG TGG AAT GAA TTG ATG GAG GCT TTA CCA AGT GGG TGG GTG 7710 
Tyr Gln Gly Trp Asn Glu Leu Met Glu Ala Leu Pro Ser Gly Trp Val
2510 2515 2520 

TAT TGT GAC GCT GAT GGT TCG CAA TTC GAC AGT TCC TTG ACT CCA TTC 7758 
Tyr Cys Asp Ala Asp Gly Ser Gln Phe Asp Ser Ser Leu Thr Pro Phe
2525 2530 2535 
CTC ATT AAT GCT GTA TTG AAA GTG CGA CTT GCC TTC ATG GAG GAA TGG 7806
Leu Ile Asn Ala Val Leu Lys Val Arg Leu Ala Phe Met Glu Glu Trp
2540 2545 2550 

GAT ATT GGT GAG CAA ATG CTG CGA AAT TTG TAG ACT GAG ATA GTG TAT 7854 
Asp Ile Gly Glu Gln Met Leu Arg Asn Leu Tyr Thr Glu Ile Val Tyr
2555 2560 2565 2570 

ACA CCA ATC CTC ACA CCG GAT GGT ACT ATC ATT AAG AAG CAT AAA GGC 7902 
Thr Pro Ile Leu Thr Pro Asp Gly Thr Ile Ile Lys Lys His Lys Gly
2575 2580 2585 

AAC AAT ACC GGG CAA CCT TCA ACA GTG GTG GAC AAC ACA CTC ATG GTC 7950
Asn Asn Ser Gly Gln Pro Ser Thr Val Val Asp Asn Thr Leu Met Val
2590 2595 2600 

ATT ATT GCA ATC TTA TAC ACA TGT GAC AAG TCT GGA ATC AAC AAG GAA 7998 

Ile Ile Ala Met Leu Tyr Thr Cys Glu Lys Cys Gly Ile Asn Lys Glu

2605 2610 2615 

GAG ATT GTG TAT TAC GTC AAT GGC CAT GAC CTA TTG ATT GCC ATT CAC 8046
Glu Ile Val Tyr Tyr Val Asn Gly Asp Asp Leu Leu Ile Ala Ile His
2620 2625 2630 

CCA GAT AAA GCT GAG AGG TTG AGT AGA TTC AAA GAA TCT TTC GGA GAG 8094 
Pro Asp Lys Ala Glu Arg Leu Ser Arg Phe Lys Glu Ser Phe Gly Glu 
2635 2640 2645 2650 

TTG GGC CTG AAA TAT GAA TTT GAC TGT ACC ACC AGG GAC AAG ACA CAG 8142 
Leu Gly Leu Lys Tyr Glu Phe Asp Cys Thr Thr Arg Asp Lys Thr Gln
2655 2660 2665 

TTG TGG TTC ATC TCA CAC AGG GCT TTG GAG AGG GAT GGC ATG TAT ATA 8190 
Leu Trp Phe Met Ser His Arg Ala Leu Glu Arg Asp Gly Met Tyr Ile
2670 2675 2680 

CCA AAG CTA GAA GAA GAA AGG ATT GTT TCT ATT TTG GAA TGG GAC AGA 8238 
Pro Lys Leu Glu Glu Glu Arg Ile Val Ser Ile Leu Glu Trp Asp Arg
2685 2690 2695 

TCC AAA CAG CCG TCA CAT AGG CTT GAA GCC ATC TGT GCA TCA ATC ATT 8286 

Ser Lys Glu Pro Ser His Arg Leu Glu Ala Ile Cys Ala Ser Met Ile
2700 2705 2710 

GAA GCA TGG GGT TAT GAC AAG CTG GTT GAA GAA ATC CGC AAT TTC TAT 8334 
Glu Ala Trp Gly Tyr Asp Lys Leu Val Glu Glu Ile Arg Asn Phe Tyr
2715 2720 2725 2730 

GCA TGG GTT TTG GAA CAA GCG CCG TAT TCA CAG CTT GCA GAA GAA GGA 8382 
Ala Trp Val Leu Glu Gln Ala Pro Tyr Ser Gln Leu Ala Glu Glu Gly
2735 2740 2745 

AAG GCG CCA TAT CTG GCT GAG ACT GCG CTT AAG TTT TTG TAC ACA TCT 8430 
Lys Ala Pro Tyr Leu Ala Glu Thr Ala Leu Lys Phe Leu Tyr Thr Ser
2750 2755 2760 

CAG CAC GGA ACA AAC TCT GAG ATA GAA GAG TAT TTA AAA GTG TTG TAT 8478 
Gln His Gly Thr Asn Ser Glu Ile Glu Glu Tyr Leu Lys Val Leu Tyr
2765 2770 2775 

GAT TAC GAT ATT CCA ACC ACT GAG AAT CTT TAT TTT CAG AGT GGC ACT 8526 
Asp Tyr Asp Ile Pro Thr Thr Glu Asn Leu Tyr Phe Gln Ser Gly Thr
2780 2785 2790 
GTG GAT GCT GGT GCT GAC GCT GGT AAG AAG AAA GAT CAA AAG GAT GAT 8574
Val Asp Ala Gly Ala Asp Ala Gly Lys Lys Lys Asp Gln Lys Asp Asp
2795 2800 2805 2810 

AAA GTC GCT GAG CAG GCT TCA AAG GAT AGG GAT GTT AAT GCT GGA ACT 8622 
Lys Val Ala Glu Gln Ala Ser Lys Asp Arg Asp Val Asn Ala Gly Thr
2815 2820 2825

TCA GGA ACA TTC TCA GTT CCA CGA ATA AAT GCT ATG GCC ACA AAA CTT 8670
Ser Gly Thr Phe Ser Val Pro Arg Ile Asn Ala Met Ala Thr Lys Leu
2830 2835 2840 

CAA TAT CCA AGG ATG AGG GGA GAG GTG GTT GTA AAC TTG AAT CAC CTT 8718 
Gln Tyr Pro Arg Met Arg Gly Glu Val Val Val Asn Leu Asn His Leu
2845 2850 2855

TTA GGA TAC AAG CCA CAG CAA ATT GAT TTG TCA AAT GCT CGA GCC ACA 8766 
Leu Gly Tyr Lys Pro Gln Gln Ile Asp Leu Ser Asn Ala Arg Ala Thr
2860 2865 2870 

CAT GAG CAG TTT GCC GCG TGG CAT CAG GCA GTC ATG ACA GCC TAT GGA 8814 
His Glu Gln Phe Ala Ala Trp His Gln Ala Val Met Thr Ala Tyr Gly
2875 2880 2885 2890 

GTG AAT GAA GAG CAA ATG AAA ATA TTG CTA AAT CGA TTT ATG GTG TGG 8862 
Val Asn Glu Glu Gln Met Lys Ile Leu Leu Asn Gly Phe Met Val Trp
2895 2900 2905 

TGC ATA GAA AAT GCG ACT TCC CCA AAT TTG AAC GGA ACT TGG GTT ATG 8910 
Cys Ile Glu Asn Gly Thr Ser Pro Asn Leu Asn Gly Thr Trp Val Met
2910 2915 2920 

ATG GAT GGT GAG GAT CAA GTT TCA TAC CCG CTG AAA CCA ATG GTT GAA 8958 
Met Asp Gly Glu Asp Gln Val Ser Tyr Pro Leu Lys Pro Met Val Glu
2925 2930 2935 

AAC GCG CAG CCA ACA CTG AGG CAA ATT ATG ACA CAC TTC AGT GAC CTG 9006 
Asn Ala Gln Pro Thr Leu Arg Gln Ile Met Thr His Phe Ser Asp Leu
2940 2945 2950 

GCT GAA GCG TAT ATT GAG ATG AGG AAT AGG GAG CGA CCA TAC ATG CCT 9054 
Ala Glu Ala Tyr Ile Glu Met Arg Asn Arg Glu Arg Pro Tyr Met Pro
2955 2960 2965 2970 

AGG TAT GGT CTA CAG AGA AAC ATT ACA GAC ATG AGT TTG TCA CGC TAT 9102 
Arg Tyr Gly Leu Gln Arg Asn Ile Thr Asp Met Ser Leu Ser Arg Tyr
2975 2980 2985 

GCC TTC GAC TTC TAT GAG CTA ACT TCA AAA ACA CCT GTT AGA GCG AGG 9150
Ala Phe Asp Phe Tyr Glu Leu Thr Ser Lys Thr Pro Val Arg Ala Arg 
2990 2995 3000 

GAG GCG CAT ATG CAA ATG AAA GCT GCT GCA GTA CGA AAC AGT GGA ACT 9198 
Glu Ala His Met Gln Met Lys Ala Ala Ala Val Arg Asn Ser Gly Thr
3005 3010 3015 

AGG TTA TTT GGT CTT GAT GGC AAC GTG GGT ACT GCA GAG GAA GAC ACT 9246 
Arg Leu Phe Gly Leu Asp Gly Asn Val Gly Thr Ala Glu Glu Asp Thr 
3020 3025 3030 

GAA CGG CAC ACA GCG CAC GAT GTG AAC CGT AAC ATG CAC ACA CTA TTA 9294 
Glu Arg His Thr Ala His Asp Val Asn Arg Asn Met His Thr Leu Leu 
3035 3040 3045 3050
GGG GTC CGC CAG TGA TAGTTTCTGC GTGTCTTTGC TTTCCGCTTT TAAGCTTATT 9349
Gly Val Arg Gln

GTAATATATA TGAATAGCTA TTCACAGTGC GACTTGGTCT TGTGTTGAAT AGTATCTTAT 9409 

ATATTTTAAT ATCTCTTATT ACTCTCATTA CTTAGGCGAA CGACAAAGTG AGGTCACCTC 9469 

CGTCTAATTC TCCTATGTAG TGCGAG 9495 
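
A listing of this kind can be proofread by translating each row of codons and comparing the result with the three-letter residues printed beneath it. The following is only a minimal sketch of such a check, assuming the standard genetic code; the function name `find_mismatches` and the label `Ter` for a stop codon are illustrative choices and are not part of the specification.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_ONE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names; "Ter" (stop) is an illustrative label.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def find_mismatches(codon_line, residue_line):
    """Return (index, codon, translated, listed) for every disagreement in one row."""
    codons = codon_line.split()
    residues = residue_line.split()
    if len(codons) != len(residues):
        raise ValueError("row has a different number of codons and residues")
    mismatches = []
    for i, (codon, listed) in enumerate(zip(codons, residues)):
        one = CODON_TO_ONE.get(codon.upper())
        # Codons containing characters other than A, C, G, T are reported as "???".
        translated = ONE_TO_THREE[one] if one else "???"
        if translated != listed:
            mismatches.append((i, codon, translated, listed))
    return mismatches


if __name__ == "__main__":
    # First coding row of the listing (positions 151 to 174).
    print(find_mismatches(
        "ATG GCA CTG ATC TTT GGC ACA GTC AAC GCT",
        "Met Ala Leu Ile Phe Gly Thr Val Asn Ala",
    ))
    # Row ending at position 3198 as printed, which contains the codon TAG.
    print(find_mismatches(
        "ATG ACA AAA GGG GTT TTT CTC AAA ATC TAG AGC ATG CTT CCT GAC GTC",
        "Met Thr Lys Gly Val Phe Leu Lys Ile Tyr Ser Met Leu Pro Asp Val",
    ))
```

A row that is flagged shows only that the printed codon and the printed residue disagree; the check cannot tell which of the two was misprinted.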